WAVE SOURCE DIRECTION DETERMINATION WITH SENSOR ARRAY



1. FIELD OF THE INVENTION 

The present invention relates generally to signal processing, and more
specifically to a system and method using signal processing for finding the direction
of a particular wave source using an array of sensors. All documents cited herein, and
all documents referenced in documents cited herein, are hereby incorporated by
reference.


2. BACKGROUND OF THE INVENTION 

A system for finding or tracking the direction of a particular wave
source has many applications. One example is a directional microphone system,
where a microphone is to be pointed in the direction of a particular sound source.
Another is a video conferencing system, where a camera needs to be moved to the
direction of the participating speaker.

One well-known technique of wave-source direction finding is
beamforming. Beamforming, itself well-known in the art, uses an array of sensors
located at different points in space. Connected to the array of sensors is a spatial filter
that combines the signals received from the sensors in a particular way so as to either
enhance or suppress signals coming from certain directions relative to signals from
other directions.

Where the sensors are microphones, unless two microphones are
located equidistant from a sound source (i.e., arranged so that the line connecting
the two microphones is perpendicular to the direction of the sound source), sound
originating from the sound source arrives at the two microphones at different times,
thereby producing a phase difference in the received signals.

If the received signals are appropriately delayed and combined by
changing the spatial filter coefficients, the behavior of the microphone array can be
adjusted such that it exhibits maximum receiving sensitivity toward a particular
direction. In other words, the direction of maximum receiving sensitivity (the so-called
"looking direction" of the microphone array) can be steered without physically
changing the direction of the microphone array. It is then possible to determine the
direction of a particular sound source by logically (computationally) steering the
looking direction of the microphone array over all directional angles and looking for
the angle that produces the maximum signal strength.

However, the use of beamforming for sound-source direction finding
has several drawbacks. First, a typical beamforming profile of receiving sensitivity
over the angles of looking direction is so flat that it is, as a practical matter, difficult
to find the peak point of maximum signal strength unless an inconveniently large
microphone array is used. For example, the 3-dB attenuation points (reference points
for signal discrimination) of a typical 15-cm microphone array may be separated by as
much as 100 degrees. At small angles such as 5 degrees, the corresponding
attenuation is insignificant. As a result, even a slight numerical error or noise may
perturb the result, giving an erroneous direction.

Second, beamforming involves scanning the space for the direction
producing the maximum received signal strength. Finding the source direction in
terms of a horizontal direction (azimuth) and a vertical direction (elevation) involves
searching a two-dimensional space, which is computationally expensive.

Third, in order to determine the source direction with a high spatial
accuracy, it is necessary to perform the beamforming calculation at a very high
resolution (for example, every 1 degree). This requires delaying and summing the
received signals at a very small delay step, which, in turn, requires that the signal be
sampled at a very high sampling rate, imposing a severe computational burden.

Another method of finding the direction of a sound source is to
measure time delays between a pair of sensors. For example, Hong Wang & Peter
Chu, Voice Source Localization for Automatic Camera Pointing System in
Videoconferencing, Proc. IEEE International Conference on Acoustics, Speech, and
Signal Processing, April 1997, pp. 187-90, disclose an array of microphones mounted
on a vertical plane, three of them arranged in a horizontal line and the fourth located
above the center one of the three. The horizontal direction of a sound source
(azimuth) is calculated by measuring the time delays of incoming signals between the
two remote microphones in the horizontal line. The vertical direction (elevation) is
calculated by measuring the time delays of incoming signals between the center
microphone in the horizontal line and the upper microphone.

The Wang & Chu system has several drawbacks. First, since all the
microphones are on the same vertical plane, they produce the same delay whether
sound is coming from the front or from the back. Since the system cannot distinguish
between front and back, ambiguities are inevitable.

Second, the performance of the system is not symmetric with respect to
looking sideways and looking forward. The capability of such a system to resolve
and estimate the direction of a source depends on the change of time delay in response
to an incremental change in the angular direction. The time delay between two
incoming signals at two adjacent microphones is:

$$\text{time delay} = \sin(\phi) \cdot \frac{\text{aperture}}{\text{sound velocity}},$$

where $\phi$ is the angle of arrival of the sound waves, measured with respect to the
normal of the microphone array, and the aperture is the spacing between the two nearest
microphones. Note that the change in time delay is obtained as the derivative of
$\sin(\phi)$, which is proportional to $\cos(\phi)$. For the same incremental angular
change, the resulting change in time delay when the looking direction is sideways (when
$\phi$ approaches 90 degrees) is smaller than when the looking direction is forward (when
$\phi$ approaches 0 degrees). As a result, the performance of the system looking sideways
is poorer than that looking forward.
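
The loss of sensitivity toward the sides can be illustrated numerically. The following is a minimal sketch, not part of the cited system; the 15-cm aperture, the 343 m/s sound velocity, and the function names are assumptions chosen only for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def pair_delay(angle_deg, aperture_m, c=SPEED_OF_SOUND):
    """Time delay (s) between two microphones for a wave arriving at
    angle_deg, measured from the normal of the microphone array."""
    return math.sin(math.radians(angle_deg)) * aperture_m / c


def delay_change(angle_deg, step_deg, aperture_m):
    """Change in delay caused by moving the source by step_deg."""
    return pair_delay(angle_deg + step_deg, aperture_m) - pair_delay(angle_deg, aperture_m)


aperture = 0.15  # hypothetical microphone spacing in meters
forward = delay_change(0.0, 1.0, aperture)    # source nearly straight ahead
sideways = delay_change(85.0, 1.0, aperture)  # source nearly broadside
print(f"1-degree step near 0 degrees:  {forward * 1e6:.2f} microseconds")
print(f"1-degree step near 85 degrees: {sideways * 1e6:.2f} microseconds")
```

The same one-degree step produces a much smaller change in delay near 85 degrees than near 0 degrees, which is why a delay-based estimate degrades when looking sideways.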

Third, the Wang & Chu system does not provide any indication of how
reliable the measurements used for the direction determination were. The time delay
measurement between a pair of microphones may not be reliable if it was measured in
the presence of noise or based on non-relevant signals. The quality of measurement
would be poor if the measurement were made in a noisy environment. Also, even if
the measurement were of high quality, it may not be relevant to the direction
determination. For example, the time delay measurement may be a measurement of
sound reflected from a wall or furniture, or of sound from a repeater source such as a
loudspeaker connected to an audio/video conferencing system. The Wang & Chu system
does not provide any mechanism for verifying the quality or relevancy of measurement.

Therefore, there exists a need for a system and method that can
determine the direction of a wave source accurately and efficiently, and that can also
indicate the quality and relevancy of the measurements on which the direction
determination is based.

3. SUMMARY OF THE INVENTION

Accordingly, it is an object of the present invention to provide a 
direction finding system capable of processing signals from an array of sensors to 
determine the direction of a particular wave source. 

Another object of the invention is to provide a system which can
estimate the wave-source direction by taking into account the directions measured
from individual microphone pairs and combining them to find the best-estimated
source direction.

Still another object is to provide a system which can verify the quality
of measurements used to calculate the source direction and can disqualify the source
direction if it is not valid under a proper measurement criterion.

These and other objects are achieved in accordance with the present
invention, which is an apparatus that uses an array of sensors for finding the
approximate direction of the wave source in terms of the positions of a selected subset
of sensors, determining the precise direction of the wave source using the approximate
direction so found, evaluating the validity of the precise direction using a
measurement criterion, and disqualifying the precise direction if the measurement
criterion is not met so that the measurements can be repeated. One preferred
embodiment of the present invention comprises an array of analog microphones for
sensing sound from a sound source, an A-to-D converter for sampling the analog
signals to produce corresponding digital signals, a bandpass filter for filtering the
digital signals to band-limited signals in a frequency band of interest, an
approximate-direction finder for finding the approximate direction of the sound source,
a precise-direction finder for finding the precise direction of the sound source based
on the approximate direction, and a measurement qualification unit for verifying the
validity of the precise direction using a certain measurement criterion and for
disqualifying the measurement if the measurement criterion is not satisfied.

The present invention has the advantage of being computationally
efficient because it does not involve a two-dimensional search of space, as a
beamformer would require. It also has the advantage of performing reliably in a noisy
environment because it verifies the validity of the source direction under a variety of
measurement criteria and repeats the measurements if necessary.



WO 99/53336 



PCT/US99/08012 




The present invention may be understood more fully by reference to
the following figures, detailed description and illustrative examples intended to
exemplify non-limiting embodiments of the invention.

4. DESCRIPTION OF THE FIGURES

Fig. 1 is a functional diagram of the overall system including a 
microphone array, an A-to-D converter, a band-pass filter, an approximate-direction 
finder, a precise-direction finder, and a measurement qualification unit in accordance 
with the present invention. 

Fig. 2 is a perspective view showing the arrangement of a particular
embodiment of the microphone array of Fig. 1.

Fig. 3 is a functional diagram of an embodiment of the approximate-direction
finder of Fig. 1.

Fig. 4 is a functional diagram of an embodiment of the precise-direction
finder of Fig. 1.

Fig. 5 is the 3-D coordinate system used to describe the present invention.

Fig. 6A is a functional diagram of a first embodiment of the
measurement qualification unit of Fig. 1.

Fig. 6B is a functional diagram of a second embodiment of the
measurement qualification unit of Fig. 1.

Fig. 6C is a functional diagram of a third embodiment of the
measurement qualification unit of Fig. 1.

Fig. 6D is a functional diagram of a fourth embodiment of the
measurement qualification unit of Fig. 1.

Figs. 7A - 7D are a flow chart depicting the operation of a program 
that can be used to implement the method in accordance with the present invention. 

5. DETAILED DESCRIPTION 

FIG. 1 shows the functional blocks of a preferred embodiment in
accordance with the present invention. The embodiment deals with finding the
direction of a sound source, but the invention is not limited to such. It will be
understood by those skilled in the art that the invention can be readily used for finding
the direction of other wave sources such as an electromagnetic wave source.

The system includes an array of microphones 1 that sense or measure
sound from a particular sound source and that produce analog signals 7 representing
the measured sound. The analog signals 7 are then sampled and converted to
corresponding digital signals 8 by an analog-to-digital (A-to-D) converter 2. The
digital signals 8 are filtered by a band-pass filter 3 so that the filtered signals 9 contain
only the frequencies in a specific bandwidth of interest for the purpose of determining
the direction of the sound source. The filtered signals 9 are then fed into an
approximate-direction finder 4, which calculates an approximate direction 10 in terms
of a microphone pair selected among the microphones. The precise-direction finder 5
estimates the precise direction 11 of the sound source based on the approximate
direction. The validity of the precise direction 11 is checked by a measurement
qualification unit 6, which invalidates the precise direction if it does not satisfy a set
of measurement criteria. Each functional block is explained in more detail below.

5.1. Microphone Array 

FIG. 2 shows an example of the array of microphones 1 that may be used
in accordance with the present invention. The microphones sense or measure the
incident sound waves from a sound source and generate electronic signals (analog
signals) representing the sound. The microphones may be omni, cardioid, or dipole
microphones, or any combination of such microphones.

The example shows a cylindrical structure 21 with six microphones 22-27
mounted around its periphery, and an upper, center microphone 28 mounted at the
center of the upper surface of the structure. The upper, center microphone is optional,
but its presence improves the accuracy of the precise direction, especially the
elevation angle. Although the example shows the peripheral microphones in a circular
arrangement, the microphone array may take on a variety of
different geometries such as a linear array or a rectangular array.

5.2 A-to-D Converter

The analog signals representing the sound sensed or measured by the
microphones are converted to digital signals by the A-to-D converter 2, which
samples the analog signals at an appropriate sampling frequency. The converter may
employ a well-known technique of sigma-delta sampling, which consists of
oversampling and built-in low-pass filtering followed by decimation to avoid aliasing,
a phenomenon due to inadequate sampling.

When an analog signal is sampled, the sampling process creates a
mirror representation of the original frequencies of the analog signal around the
frequencies that are multiples of the sampling frequency. "Aliasing" refers to the
situation where the analog signal contains information at frequencies above one half
of the sampling frequency so that the reflected frequencies cross over the original
frequencies, thereby distorting the original signal. In order to avoid aliasing, an
analog signal should be sampled at a rate at least twice its maximum frequency
component; this minimum rate is referred to herein as the Nyquist frequency (it is
more commonly called the Nyquist rate).

In practice, a sampling frequency far greater than the Nyquist
frequency is used to avoid aliasing problems with system noise and less-than-ideal
filter responses. This oversampling is followed by low-pass filtering to cut off the
frequency components above the maximum frequency component of the original
analog signal. Once the digital signal is Nyquist limited, its rate is reduced by
decimation: if the oversampling frequency is n times the Nyquist frequency, the
decimator takes one sample for every n input samples.
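
The filter-then-decimate step can be sketched as follows. This is only an illustration under stated assumptions: a moving average stands in for the low-pass filter (a real sigma-delta converter uses a far sharper built-in filter), and the function names are hypothetical.

```python
def moving_average_lowpass(samples, taps):
    """Crude low-pass filter: running mean over the last `taps` samples.
    This is an assumed stand-in for the converter's built-in filter."""
    out = []
    acc = 0.0
    for i, s in enumerate(samples):
        acc += s
        if i >= taps:
            acc -= samples[i - taps]  # drop the sample leaving the window
        out.append(acc / taps)
    return out


def decimate(samples, n):
    """Low-pass filter, then keep one sample for every n input samples."""
    filtered = moving_average_lowpass(samples, n)
    # Start at index n - 1 so every kept value is a full-window average.
    return filtered[n - 1::n]


oversampled = [1, 1, 1, 1, 3, 3, 3, 3]
print(decimate(oversampled, 4))
```

Filtering before discarding samples matters: dropping samples without first removing the components above the new half-sampling frequency would reintroduce aliasing.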

An alternative approach to avoid aliasing is to limit the bandwidth of the
signals, before the sampling process, to half the sampling frequency using an analog
filter. This approach, however, would require an analog filter with a very sharp
frequency cut-off characteristic.

5.3 Bandpass Filter

The purpose of the bandpass filter 3 is to filter the signals sensed or
measured by the microphones so that the filtered signals contain those frequencies
optimal for detecting or determining the direction of the signals. Signals of too low a
frequency do not produce enough phase difference at the microphones to accurately
detect the direction. Signals of too high a frequency have less signal energy and are
thus more subject to noise. By suppressing signals of the extreme high and low
frequencies, the bandpass filter 3 passes those signals of a specific bandwidth that can
be further processed to detect or determine the direction of the sound source. The
specific values of the bandwidth depend on the type of target wave source. If the
source is a human speaker, the bandwidth may be between 300 Hz and 1500 Hz,
where typical speech signals have most of their energy concentrated. The bandwidth
may also be changed by a calibration process, that is, a trial-and-error process. Instead
of using a fixed bandwidth during the operation, initially a certain bandwidth is tried. If
too many measurement errors result, the bandwidth is adjusted to decrease the
measurement errors so as to arrive at the optimal bandwidth.
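
The reason low frequencies carry little directional information can be seen from the phase difference they produce across a microphone pair. The following is a rough sketch; the 5-cm spacing, the 30-degree arrival angle (measured from the pair's normal), the 343 m/s sound velocity, and the function name are all assumptions for illustration.

```python
import math


def phase_difference_deg(freq_hz, spacing_m, angle_deg, c=343.0):
    """Phase difference (degrees) between two microphones for a tone of
    freq_hz arriving at angle_deg from the normal of the pair."""
    delay = spacing_m * math.sin(math.radians(angle_deg)) / c
    return 360.0 * freq_hz * delay


spacing = 0.05  # hypothetical microphone spacing in meters
for freq in (50.0, 300.0, 1500.0):
    phase = phase_difference_deg(freq, spacing, 30.0)
    print(f"{freq:6.0f} Hz -> {phase:5.2f} degrees of phase difference")
```

Under these assumptions a 50 Hz tone yields only about 1.3 degrees of phase difference, which is easily masked by noise, while tones inside the 300 Hz to 1500 Hz band yield several times more.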

5.4 Direction Estimation 

For efficiency of computation, the system first finds the approximate
direction of the sound source, without the burden of heavy computation, and
subsequently calculates the precise direction by using more computation power. The
approximate direction is also used to determine the subset of microphones that are
relevant to subsequent refinement of the approximate direction. In some
configurations, some of the microphones may not have a line of sight to the source,
and thus may create phase errors if they participate in further refinement of the
approximate direction. Therefore, a subset of microphones is selected that would be
relevant to further refinement of the source direction.

5.4.1 Approximate-Direction Finding

FIG. 3 shows the approximate-direction finder 4 in detail. It is based
on the idea of specifying the approximate direction of the sound source in terms of a
direction perpendicular to a pair of microphones. Let peripheral microphone pairs be
the microphones located adjacent to each other around the periphery of the structure
holding the microphones, except that a microphone located at the center of the
structure, if any, is excluded. For each peripheral microphone pair, "pair direction"
is defined as the direction in the horizontal plane, pointing from the center of the pair
outward from the structure, perpendicular to the line connecting the peripheral
microphone pair.

"Sector direction" is then defined as the pair direction closest to the 
source direction, selected among possible pair directions. If there are n pairs of 
peripheral microphones, there would be n candidates for the sector direction. 

The sector direction corresponding to the sound source is determined
using a zero-delay cross-correlation. For each peripheral microphone pair, a
correlation calculator 31 calculates a zero-delay cross-correlation of the two signals
received from the microphone pair, $X_i(t)$ and $X_j(t)$. It is known to those skilled in
the art that such a zero-delay cross-correlation function, $R_{ij}(0)$, over a time period
$T$ can be defined by the following formula:

$$R_{ij}(0) = \sum_{t=0}^{T} X_i(t)\, X_j(t)$$

It is noted that a correlation calculator is well-known to those skilled in the art and
may be available as an integrated circuit. Otherwise, it is well-known that such a
correlation calculator can be built using discrete electronic components such as
multipliers, adders, and shift registers.

Among the peripheral microphone pairs, block 32 finds the sector
direction by selecting the microphone pair that produces the maximum correlation.
Since signals having the same or similar phase are strongly correlated with each other,
the result is to find the pair with the same phase (equi-phase) or having the least phase
difference. Since the plane of equi-phase is perpendicular to the propagation direction
of the sound wave, the pair direction of the maximum-correlation pair is, then, the
sector direction, i.e., the pair direction closest to the source direction.
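
The selection of the maximum-correlation pair can be sketched in software as follows. This is a minimal illustration, not the hardware correlation calculator described above; it assumes the signals are given as equal-length sample lists ordered around the periphery, and the helper `make_tone` and all function names are hypothetical.

```python
import math


def zero_delay_correlation(x_i, x_j):
    """R_ij(0): sum of sample-by-sample products of the two signals."""
    return sum(a * b for a, b in zip(x_i, x_j))


def find_sector_pair(signals):
    """Return ((i, j), R_ij(0)) for the adjacent peripheral pair with the
    largest zero-delay cross-correlation. `signals` is assumed to be ordered
    around the periphery, so microphone i is adjacent to i + 1 (wrapping)."""
    n = len(signals)
    best_pair, best_r = None, float("-inf")
    for i in range(n):
        j = (i + 1) % n
        r = zero_delay_correlation(signals[i], signals[j])
        if r > best_r:
            best_pair, best_r = (i, j), r
    return best_pair, best_r


def make_tone(phase, n=64, period=16):
    """Hypothetical test signal: a sampled sine with the given phase."""
    return [math.sin(2.0 * math.pi * t / period + phase) for t in range(n)]


# Microphones 1 and 2 receive the tone in phase, so they should win.
signals = [make_tone(p) for p in (1.5, 0.2, 0.2, 2.5)]
pair, r = find_sector_pair(signals)
print(pair, round(r, 3))
```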

Once the sector direction is found, block 33 identifies the microphones
that participate in further refinement of the approximate direction. "Sector" is defined
as the subset of the microphones in the microphone array that participate in
calculating the precise direction of the sound source. For example, where some of the
microphones in the array are blocked by a mechanical structure, the signals received
by those microphones are not likely to be from direct-travelling waves, and thus such
microphones should be excluded from the sector.

In one preferred embodiment, the sector includes the maximum-correlation
peripheral microphone pair, another peripheral microphone adjacent to the
pair, and a center microphone, if any. Of the two peripheral microphones adjacent to the
maximum-correlation peripheral microphone pair, the one with the higher zero-delay
cross-correlation is selected. The inclusion of the center microphone is optional, but
the inclusion helps to improve the accuracy of the source direction, because otherwise
three adjacent microphones would be arranged almost in a straight line. There may be
other ways of selecting the microphones to be included in the sector, and the
information about such selection schemes may be stored in computer memory for
easy retrieval during the operation of the system.
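
One possible reading of this selection rule is sketched below. It assumes that the "zero-delay cross-correlation" of an adjacent microphone means the correlation of the peripheral pair it forms with its neighbor in the maximum-correlation pair; the text does not state this explicitly, and the function name and index layout are hypothetical.

```python
def select_sector(pair_correlations, best_index, center_index=None):
    """Pick the microphones that form the sector.

    pair_correlations[k] is assumed to hold R(0) for the peripheral pair
    (k, k + 1), wrapping around; best_index is the index of the
    maximum-correlation pair.
    """
    n = len(pair_correlations)
    i, j = best_index, (best_index + 1) % n
    before = (i - 1) % n  # neighbor on the i side; pair (before, i) has index `before`
    after = (j + 1) % n   # neighbor on the j side; pair (j, after) has index j
    third = before if pair_correlations[before] > pair_correlations[j] else after
    sector = [i, j, third]
    if center_index is not None:
        sector.append(center_index)  # optional center microphone
    return sector


# Six peripheral microphones (0-5) plus a center microphone with index 6.
print(select_sector([0.2, 0.9, 0.5, 0.1, 0.0, 0.3], best_index=1, center_index=6))
```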


5.4.2 Precise-Direction Finding 

The precise-direction finder 5 calculates the precise direction of the
sound source using a full cross-correlation. Block 41 first identifies all possible
combinations of microphone pairs within the sector. For each microphone pair
identified, block 42 calculates a full cross-correlation, $R_{ij}(\tau)$, over a time period
$T$ using the following formula, which is well-known to those skilled in the art:

$$R_{ij}(\tau) = \sum_{t=0}^{T} X_i(t)\, X_j(t-\tau)$$

As mentioned before, a correlation calculator is well-known to those skilled in the art
and may be available as an integrated circuit. Otherwise, it is well-known that such a
correlation calculator can be built using discrete electronic components such as
multipliers, adders, and shift registers (Berdugo et al., "On Direction Finding of an
Emitting Source From Time Delays," annexed hereto).

$R_{ij}(\tau)$ can be plotted as a cross-correlation curve. For each $R_{ij}(\tau)$,
block 43 finds the delay, $\tau_s$, corresponding to the peak point of the cross-correlation
curve. Note that this peak-correlation delay $\tau_s$ lies at a sampling point. In reality,
however, the maximum-correlation point may be located between sampling points.
Therefore, block 44 calculates such maximum-correlation delay (which may be
between sampling points), $\tau_d$, by interpolating the cross-correlation function using a
parabolic curve ($y = p x^2 + q x + r$) as follows:

$$C(k-1) = p(k-1)^2 + q(k-1) + r$$
$$C(k) = p k^2 + q k + r$$
$$C(k+1) = p(k+1)^2 + q(k+1) + r$$

By solving the above equations for $p$, $q$, and $r$, the maximum point is obtained by
taking the derivative of the parabolic curve and setting it to zero. The maximum
point, in units of samples, is $-q/(2p)$, so that $\tau_d$ is further expressed as
follows:

$$\tau_d = \frac{1}{f_s}\left(k + \frac{C(k-1) - C(k+1)}{2\,\bigl(C(k-1) - 2C(k) + C(k+1)\bigr)}\right)$$

where $f_s$ denotes the sampling frequency; $k$ denotes the sampling point corresponding
to $\tau_s$; and $C(k)$ is the cross-correlation value at sampling point $k$. The use of the
interpolation technique improves the accuracy of the maximum-correlation delay,
while eliminating the need for using a very high sampling rate.
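
The peak search and parabolic interpolation can be sketched as follows. This is an illustrative software version only; the lag range `max_lag`, the dictionary representation of the correlation curve, and the function names are assumptions.

```python
def cross_correlation(x_i, x_j, max_lag):
    """Return {lag: R_ij(lag)} with R_ij(lag) = sum over t of x_i[t] * x_j[t - lag].
    Samples falling outside the recorded block are treated as zero."""
    n = len(x_i)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for t in range(n):
            if 0 <= t - lag < n:
                total += x_i[t] * x_j[t - lag]
        result[lag] = total
    return result


def interpolated_delay(corr, fs):
    """Delay (s) of the correlation peak, refined by fitting a parabola
    through the peak sample and its two neighbors."""
    k = max(corr, key=corr.get)  # sampling point of the peak
    if k - 1 not in corr or k + 1 not in corr:
        return k / fs            # peak at the edge: no neighbors to fit
    a, b, c = corr[k - 1], corr[k], corr[k + 1]
    denom = 2.0 * (a - 2.0 * b + c)
    if denom == 0.0:
        return k / fs            # flat top: keep the sampled peak
    return (k + (a - c) / denom) / fs


# A correlation curve that is exactly a parabola peaking at 2.3 samples.
curve = {lag: -(lag - 2.3) ** 2 for lag in range(-5, 6)}
print(interpolated_delay(curve, fs=1.0))
```

With a sampling frequency of 1, the sampled peak lies at lag 2 while the interpolated estimate recovers the fractional peak position.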

Since each maximum-correlation delay calculated for each microphone
pair indicates the direction of the sound source as measured by that individual microphone
pair, the individual maximum-correlation delays are combined to estimate an average
direction of the sound source. The estimation process provides a better indication of
the source direction than each individual measured direction because it eliminates
ambiguity problems inherent to each individual pair and provides a mechanism to
verify the relevancy of the individual measurements by possibly eliminating those
individual measurements that are far off from the source direction.

Block 45 calculates the precise direction of the sound source in terms
of a vector in the Cartesian coordinates, $K = (K_x, K_y, K_z)$, from the vector of
individual measured delays, $T_d$, by solving the linear equation between $K$ and $T_d$.
The time delay between any two sensors is equal to the projection of the distance
vector between them along the $K$ vector divided by the sound velocity. Thus, the $T_d$
vector can be expressed as follows:

$$T_d = -\frac{R K}{c}$$

where $c$ is the speed of sound and $R$ denotes the matrix representing the geometry of the
microphone array in terms of position differences among the microphones as follows:

$$R = \begin{bmatrix} X_2 - X_1 & Y_2 - Y_1 & Z_2 - Z_1 \\ \vdots & \vdots & \vdots \\ X_m - X_1 & Y_m - Y_1 & Z_m - Z_1 \end{bmatrix}$$

Since the above equation is over-determined in that there are more
constraints than the number of variables, the least-squares (LS) method is used to
obtain the optimal solution. Defining the error as the difference between the
measured time delay vector and the time delay vector evaluated from $K$, the error
vector $\varepsilon$ is given by:

$$\varepsilon = \frac{R K}{c} + T_d$$

The solution depends on the covariance matrix $\Lambda$ of the delay measurements, which is
defined by

$$\Lambda = E\{T_d T_d^T\} - E\{T_d\}E\{T_d\}^T = \mathrm{COV}\{T_d\}$$

where $E\{\cdot\}$ denotes the expected value operator, and $\{\cdot\}^T$ denotes the transpose
of a matrix. The LS estimated solution, $K$, is then expressed in the following formula:

$$K = -c\,(R^T \Lambda^{-1} R)^{-1} R^T \Lambda^{-1}\, T_d = -c\,B\,T_d$$

where $\{\cdot\}^{-1}$ denotes the inverse of a matrix. For derivation of the equation, see A.
Gelb, Applied Optimal Estimation, The M.I.T. Press, Cambridge, Massachusetts,
1974, e.g., at page 103.

Note that the B matrix depends only on the geometry of the
microphone array, and thus can be computed off-line, without burdening the
computation requirement during the direction determination.
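
The least-squares step can be sketched for the simplified case in which the covariance matrix of the delay measurements is taken to be the identity (equal, uncorrelated measurement errors); that simplification, the 343 m/s sound velocity, and the function name are assumptions of this sketch. It also assumes a 3-dimensional array geometry, so that the normal matrix is invertible.

```python
def solve_direction(R, Td, c=343.0):
    """Least-squares K from the relation Td = -(R K) / c, assuming an
    identity covariance matrix. Solves the normal equations
    (R^T R) K = -c R^T Td by Gaussian elimination with partial pivoting."""
    n = 3
    A = [[sum(row[a] * row[b] for row in R) for b in range(n)] for a in range(n)]
    y = [-c * sum(row[a] * t for row, t in zip(R, Td)) for a in range(n)]
    M = [A[i] + [y[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    K = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        K[i] = (M[i][n] - sum(M[i][k] * K[k] for k in range(i + 1, n))) / M[i][i]
    return K


# Hypothetical position differences (meters) for four microphone pairs.
R = [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.0, 0.1], [0.1, 0.1, 0.0]]
K_true = [0.6, 0.0, 0.8]
Td = [-sum(r * k for r, k in zip(row, K_true)) / 343.0 for row in R]
print(solve_direction(R, Td))
```

In a real system the matrix applied to the delay vector would be computed once, off-line, rather than re-solved for every measurement as this sketch does.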

Block 46 converts $K$ into polar coordinates. FIG. 5 shows the 3-dimensional
coordinate system used in the present invention. An azimuth angle, $\phi$, is
defined as the angle of the source direction in the horizontal plane, measured
clockwise from a reference horizontal direction (e.g. x-axis). An elevation angle, $\theta$, is
defined as the vertical angle of the source direction measured from the vertical axis
(z-axis).

Block 46 calculates $\phi$ and $\theta$ from $K_x$, $K_y$, and $K_z$ by converting the
Cartesian coordinates to the polar coordinates, solving the nonlinear equation
between $(K_x, K_y, K_z)$ and $(\phi, \theta)$:

$$\begin{bmatrix} K_x \\ K_y \\ K_z \end{bmatrix} = \begin{bmatrix} \sin\theta\cos\phi \\ \sin\theta\sin\phi \\ \cos\theta \end{bmatrix}$$

In the case of a 3-dimensional microphone array (with the upper
microphone), the above equation yields three non-linear equations with two
unknowns $(\phi, \theta)$. The problem is over-determined in that there are more equations
than the number of variables. The LS solution for $(\phi, \theta)$ has no closed-form
solution, but a suboptimal, closed-form estimate can be found as:

$$\phi = \tan^{-1}\!\left(\frac{K_y}{K_x}\right)$$

$$\theta = \tan^{-1}\!\left(\frac{\sqrt{K_x^2 + K_y^2}}{K_z}\right)$$

If a 2-dimensional microphone array were used (without the upper
microphone), block 46 calculates $\phi$ and $\theta$ from $K_x$ and $K_y$ using the following
formulas:

$$\phi = \tan^{-1}\!\left(\frac{K_y}{K_x}\right)$$

$$\theta = \cos^{-1}\!\left(\sqrt{1 - (K_x^2 + K_y^2)}\right)$$

Note that the algorithm can function even when the microphones are
arranged in a 2-dimensional arrangement and is still capable of resolving the azimuth
and elevation.
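
The conversion can be sketched as follows. Two choices here are assumptions of the sketch rather than statements of the text: `math.atan2` is used in place of a plain arctangent so that the quadrant is resolved, and the azimuth follows the sign convention of the matrix equation (the angle whose cosine and sine give $K_x$ and $K_y$). The function name is hypothetical.

```python
import math


def to_angles(kx, ky, kz=None):
    """Return (azimuth, elevation) in degrees from the direction vector.
    Pass kz=None for a 2-dimensional array, in which case the elevation is
    recovered from the horizontal components of the (unit) vector alone."""
    azimuth = math.atan2(ky, kx)
    if kz is not None:
        elevation = math.atan2(math.hypot(kx, ky), kz)
    else:
        horizontal_sq = min(1.0, kx * kx + ky * ky)  # guard against rounding
        elevation = math.acos(math.sqrt(1.0 - horizontal_sq))
    return math.degrees(azimuth), math.degrees(elevation)


print(to_angles(0.6, 0.0, 0.8))  # 3-dimensional array
print(to_angles(0.6, 0.0))       # 2-dimensional array, same source
```

Note that the 2-dimensional formula cannot tell a source above the horizontal plane from one below it, since only the magnitude of the vertical component is recovered.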


5.5 Measurement Qualification Unit 

When the precise-direction finder 5 calculates the precise direction of
the sound source, the result may not reflect the true direction of the sound source due
to various noise and measurement errors. The purpose of the measurement
qualification unit 6 is to evaluate the soundness or validity of the precise direction
using a variety of measurement criteria and to invalidate the measurements if the
criteria are not satisfied.

FIGS. 6a, 6b, 6c, and 6d show different embodiments of the
measurement qualification unit, each using a different measurement criterion. These
embodiments may be used individually or in any combination.

FIG. 6a shows a first embodiment of the qualification unit that uses a
signal-to-noise ratio (SNR) as a measurement criterion. The SNR is defined as the ratio
of a signal power to a noise power. To calculate the SNR, the measured signals are
divided into blocks of signals having a predetermined period such as 40 milliseconds.
Block 61 calculates the signal power for each signal block by calculating the
square-sum of the sampled signals within the block. The noise power can be measured in
many ways, but one convenient way of measuring the noise power may be to pick the
signal power of the signal block having the minimum signal power and to use it as the
noise power. Block 62 selects the signal block having the minimum power over a
predetermined interval such as 2 seconds. Block 63 calculates the SNR as the ratio of
the signal power of the current block to the noise power. Block 64 invalidates
the precise direction if the SNR is below a certain threshold.
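
The SNR criterion can be sketched as follows. The block length and window are expressed in samples and blocks (for example, 40-millisecond blocks and a 2-second interval correspond to a window of 50 blocks); the threshold value, the treatment of a zero noise estimate, and the function names are assumptions of this sketch.

```python
def block_powers(samples, block_len):
    """Square-sum of the samples in each complete block."""
    return [
        sum(s * s for s in samples[i:i + block_len])
        for i in range(0, len(samples) - block_len + 1, block_len)
    ]


def qualify_by_snr(powers, current, window_blocks, threshold):
    """Return (snr, is_valid) for block `current`. The noise power is taken
    as the minimum block power over the last `window_blocks` blocks."""
    recent = powers[max(0, current - window_blocks + 1):current + 1]
    noise = min(recent)
    # Assumed convention: a zero noise estimate counts as an unbounded SNR.
    snr = float("inf") if noise == 0 else powers[current] / noise
    return snr, snr >= threshold


powers = block_powers([1, 1, 3, 3, 0.5, 0.5, 2, 2], block_len=2)
print(powers)
print(qualify_by_snr(powers, current=3, window_blocks=4, threshold=10.0))
```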

FIG. 6b shows a second embodiment of the measurement qualification
unit that uses a spread (range of distribution) of individual measured delays as a
measurement criterion. The precise source direction calculated by the
precise-direction finder represents an average direction among the individual
directions measured by microphone pairs in the sector. Since delays are directly
related to direction angles, the spread of the individual measured delays with respect
to the individual estimated delays indicates how widely the individual directions vary
with respect to the precise direction. Thus, the spread gives a good indication as to
the validity of the measurements. For example, if the individual measured delays are
too widely spread, it is likely to indicate some kind of measurement error.

$T_e$ is defined as a vector representing the set of individual estimated
delays $\tau_e$ corresponding to the precise direction, $K$. Block 71 calculates $T_e$ from $K$
based on the linear relation between $K$ and $T_e$:

$$T_e = -\frac{R K}{c}$$

where $R$ denotes the position difference matrix representing the geometry of the
microphone array as follows

$$R = \begin{bmatrix} X_2 - X_1 & Y_2 - Y_1 & Z_2 - Z_1 \\ \vdots & \vdots & \vdots \\ X_m - X_1 & Y_m - Y_1 & Z_m - Z_1 \end{bmatrix}$$

and $c$ is the propagation velocity of sound waves.

Block 72 compares the individual measured delays $\tau_d$ with the
individual estimated delays $\tau_e$ and calculates the spread of individual measured delays
using the following measure:

$$\sum \varepsilon^2 = \sum (\tau_d - \tau_e)^2$$

If this spread exceeds a certain threshold, block 73 invalidates the precise source
direction.

Alternatively, the spread can be calculated directly from the individual
measured delays using the following:

$$\sum \varepsilon^2 = \lVert E\, T_d \rVert^2$$

where $E = R(R^T R)^{-1} R^T - I$, and $I$ is the identity matrix.
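
The first form of the spread criterion can be sketched as follows, computing the estimated delays from the precise direction and summing the squared differences. The 343 m/s sound velocity, the threshold, and the function names are assumptions for illustration.

```python
def estimated_delays(R, K, c=343.0):
    """Delays implied by direction K for each row of the position
    difference matrix R, using the relation Te = -(R K) / c."""
    return [-sum(r * k for r, k in zip(row, K)) / c for row in R]


def delay_spread(Td, Te):
    """Sum of squared differences between measured and estimated delays."""
    return sum((d - e) ** 2 for d, e in zip(Td, Te))


def spread_is_valid(R, K, Td, threshold, c=343.0):
    """True if the measured delays agree with the precise direction K."""
    return delay_spread(Td, estimated_delays(R, K, c)) <= threshold


# Hypothetical geometry (meters) and direction.
R = [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0]]
K = [1.0, 0.0, 0.0]
consistent = estimated_delays(R, K)
outlier = [consistent[0], consistent[1] + 1e-3]  # one delay off by 1 ms
print(spread_is_valid(R, K, consistent, threshold=1e-9))
print(spread_is_valid(R, K, outlier, threshold=1e-9))
```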

FIG. 6c shows a third embodiment of the measurement qualification
unit that uses the azimuth angle, $\phi$, as a measurement criterion. If $\phi$ deviates
significantly from the sector direction (the approximate source direction), it is likely
to indicate that the precise direction is false. Therefore, if $\phi$ is not within a
permissible range of angles (e.g. within +/- 60 degrees) of the sector direction, the
precise direction is invalidated.

FIG. 6d shows a fourth embodiment of the measurement qualification
unit that uses the elevation angle, $\theta$, as a measurement criterion. If $\theta$ deviates
significantly from the horizontal direction (where $\theta$ = 90°), it is likely to indicate the
direction of sound waves reflected from the ceiling or the floor rather than that of
direct sound waves. Therefore, if $\theta$ is not within a range of allowable angles (e.g.
from 30° to 150°), the precise direction is invalidated.

As mentioned before, the above embodiments can be used selectively
or combined to produce a single quality figure of measurement, Q, which may be sent
to a target system such as a controller for a videoconferencing system. For example,
Q may be set to 0 if any of the error conditions above occurs and set to the SNR
otherwise.
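A minimal sketch of such a combined quality figure follows. The function and parameter names are hypothetical; the angular limits reuse the example ranges given above, while the spread threshold is a placeholder with no basis in the specification.

```python
def angle_diff_deg(a, b):
    """Smallest signed difference a - b in degrees, wrapped to [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def quality_figure(snr, azimuth, elevation, sector_azimuth, spread,
                   max_azimuth_dev=60.0, elevation_range=(30.0, 150.0),
                   spread_max=1e-8):
    """Return 0 if any error condition holds, otherwise the SNR."""
    azimuth_bad = abs(angle_diff_deg(azimuth, sector_azimuth)) > max_azimuth_dev
    elevation_bad = not (elevation_range[0] <= elevation <= elevation_range[1])
    spread_bad = spread > spread_max  # spread_max is an assumed placeholder
    if azimuth_bad or elevation_bad or spread_bad:
        return 0.0
    return snr
```

The azimuth comparison wraps around 360 degrees, so an estimate of 350 degrees in a sector centred on 10 degrees counts as a 20 degree deviation rather than 340.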

The direction finding system of the present invention can be used in
combination with a directional microphone system, which may include an adaptive
filter. Such an adaptive filter is not limited to a particular kind of adaptive filter. For
example, one can practice the present invention in combination with the invention
disclosed in applicant's commonly assigned and co-pending U.S. patent application
Serial No. 08/672,899, filed June 27, 1996, entitled 'System and Method for Adaptive
Interference Cancelling,' by inventor Joseph Marash and its corresponding PCT
application WO 97/50186, published December 31, 1997. Both applications are
incorporated by reference herein in their entirety.

Specifically, the adaptive filter may include weight constraining means
for truncating updated filter weight values to predetermined threshold values when
each of the updated filter weight values exceeds the corresponding threshold value.
The adaptive filter may further include inhibiting means for estimating the power of
the main channel and the power of the reference channels and for generating an
inhibit signal to the weight updating means based on the normalized power difference
between the main channel and the reference channels.
The weight constraining means may include a frequency-selective
weight-control unit, which includes a Fast Fourier Transform (FFT) unit for receiving
adaptive filter weights and performing the FFT of the filter weights to obtain
frequency representation values, a set of frequency bins for storing the frequency
representation values divided into a set of frequency bands, a set of truncating units
for comparing the frequency representation values with a threshold assigned to each
bin and for truncating the values if they exceed the threshold, a set of storage cells for
temporarily storing the truncated values, and an Inverse Fast Fourier Transform
(IFFT) unit for converting them back to the adaptive filter weights.
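The frequency-selective weight constraint described above can be illustrated with a small sketch. Two assumptions are made here and are not taken from the referenced applications: "truncating" is interpreted as clipping the magnitude of each frequency value to its bin threshold while keeping its phase, and a naive DFT stands in for the FFT so that the example needs no external library. The function names are hypothetical.

```python
import cmath


def dft(x, inverse=False):
    """Naive discrete Fourier transform (stand-in for an FFT unit)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out


def constrain_weights(weights, thresholds):
    """Clip each frequency bin to its threshold, then transform back."""
    spectrum = dft(weights)
    clipped = []
    for value, limit in zip(spectrum, thresholds):
        magnitude = abs(value)
        # Assumed interpretation: scale the bin down, preserving its phase.
        clipped.append(value * (limit / magnitude) if magnitude > limit else value)
    # Real weights are recovered when thresholds are symmetric across bins.
    return [v.real for v in dft(clipped, inverse=True)]
```

For real filter weights the thresholds should be symmetric (bin k and bin n-k share a limit); otherwise the inverse transform is no longer purely real and taking the real part discards information.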

The adaptive filter in the directional microphone that may be used in
combination with the present invention may also employ a dual-processing interference
cancelling system where adaptive filter processing is used for a subset of a frequency
range and fixed filter processing is used for another subset of the frequency range.
For example, one can practice the present invention in combination with the invention
disclosed in applicant's commonly assigned and co-pending U.S. patent application
Serial No. 08/840,159, filed April 14, 1997, entitled 'Dual-Processing Interference
Cancelling System and Method,' by inventor Joseph Marash, corresponding
continuation-in-part application, Ser. No. 09/055,709, filed April 7, 1998, and
corresponding PCT Application Ser. No. PCT/IL98/00179, filed April 14, 1998. All
three applications are incorporated by reference herein in their entirety.

It is noted that the adaptive filter processing portion of the dual
processing may also employ the adaptive filter processing disclosed in applicant's
commonly assigned and co-pending U.S. patent application Serial No. 08/672,899,
filed June 27, 1996, entitled 'System and Method for Adaptive Interference
Cancelling,' by inventor Joseph Marash and its corresponding PCT application WO
97/50186, published December 31, 1997.

5.6 Software Implementation

The present invention described herein may be implemented using a
commercially available digital signal processor (DSP) such as Analog Devices' 2100
Series or any other general purpose microprocessor. For more information on the
Analog Devices 2100 Series, see Analog Devices, ADSP-2100 Family User's Manual,
3rd Ed., 1995.

FIGS. 8A-8D show a flow chart depicting the operation of a program 
in accordance with a preferred embodiment of the present invention. The program 
uses measurement flags to indicate various error conditions. 

When the program starts (step 100), it resets the system (step 101) by
resetting system variables including various measurement flags used for indicating
error conditions. The program then reads into registers microphone inputs sampled at
the sampling frequency of 64 kHz (step 102), which is oversampled relative to the Nyquist
frequency. As mentioned in Section 5.2, oversampling allows anti-aliasing filters to
be realized with a much gentler cut-off characteristic. Upon reading
every 5 samples (step 103), the program performs a low-pass filter operation and a
decimation by taking one sample out of every 5 samples for each microphone (step
104). The decimated samples are stored in the registers (step 105).
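Steps 103 to 105 can be sketched as follows. The specification does not state which low-pass filter is used, so this sketch assumes the simplest possible choice, a 5-tap moving average computed once per block of 5 samples; the function name is hypothetical.

```python
def lowpass_decimate(samples, factor=5):
    """Average each block of `factor` samples and keep one value per block.

    The block average acts as a crude FIR low-pass filter (an assumption;
    a real design would likely use a longer filter), and emitting one
    value per block performs the decimation.
    """
    usable = len(samples) - len(samples) % factor
    return [sum(samples[i:i + factor]) / factor
            for i in range(0, usable, factor)]
```

With the 64 kHz input rate stated above and a factor of 5, the output rate is 12.8 kHz.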

The program performs a bandpass filter operation on the decimated
samples so that the output contains frequencies ranging from 1.5 to 2.5 kHz (step
106). The output is stored in input memory (step 107). The program repeats the
above procedure until 512 new samples are obtained (step 108).

When 512 new samples have been obtained, the program takes each pair of
adjacent microphones, multiplies the received signals and adds the products to obtain the
zero-delay cross-correlation (step 200), and the results are stored (step 206). The
calculation of zero-delay cross-correlation is repeated for all adjacent microphone
pairs, not involving the center microphone (step 201).

The microphone pair having the highest zero-delay cross-correlation is
selected (step 202) and the value is stored as the signal power (step 207), which will
be used later. For the two microphones adjacent to the selected pair, the program
calculates the zero-delay cross-correlation (step 203) and the microphone having the higher
correlation is selected (step 204). The program determines the sector by including the
selected microphone pair, the neighboring microphone selected, and the center
microphone, if there is one.
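The sector selection of steps 200 to 204 might look like the sketch below. Several details are assumptions made for illustration: the perimeter microphones are taken to form a closed ring indexed 0 to n-1, each neighbouring microphone is scored by its zero-delay cross-correlation with the member of the selected pair next to it, and the center microphone (if any) is handled outside this function. The names are hypothetical.

```python
def zero_delay_xcorr(a, b):
    """Multiply two signals sample by sample and add the products."""
    return sum(x * y for x, y in zip(a, b))


def select_sector(signals):
    """Return (sorted perimeter microphone indices, signal power).

    `signals` holds one sample list per perimeter microphone, ordered
    around the ring (assumed layout, at least four microphones).
    """
    n = len(signals)
    pair_corr = [zero_delay_xcorr(signals[i], signals[(i + 1) % n])
                 for i in range(n)]
    best = max(range(n), key=pair_corr.__getitem__)  # pair (best, best + 1)
    left, right = (best - 1) % n, (best + 2) % n
    left_corr = pair_corr[left]               # correlation of (left, best)
    right_corr = pair_corr[(best + 1) % n]    # correlation of (best + 1, right)
    neighbor = left if left_corr >= right_corr else right
    sector = sorted({best, (best + 1) % n, neighbor})
    return sector, pair_corr[best]
```

The second return value is the highest pair correlation, which the text stores as the signal power for the later SNR calculation.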

The program calculates the average power of the 512 samples taken 
from the center microphone (step 300). The lowest average energy during the latest 2 
seconds is set to be the noise power (steps 301-305). 
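The noise-power tracking of steps 300 to 305 amounts to a running minimum over recent block powers. In this sketch the class name and the window length are assumptions: with 512 samples per block at the decimated rate of 12.8 kHz, a block lasts 40 ms, so 2 seconds corresponds to about 50 blocks.

```python
from collections import deque


class NoiseFloor:
    """Track the lowest average block power over the most recent blocks."""

    def __init__(self, blocks=50):
        # 50 blocks of 40 ms each cover roughly 2 seconds (derived figure).
        self.history = deque(maxlen=blocks)

    def update(self, block):
        """Add one block of samples and return the current noise power."""
        power = sum(s * s for s in block) / len(block)
        self.history.append(power)
        return min(self.history)
```

Because the deque has a fixed maximum length, block powers older than the window drop out automatically, and the noise estimate can rise again once a quiet block has aged out.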

The program calculates the full cross-correlation of signals received by
each microphone pair in the sector (step 306). The program finds the peak cross-correlation
delay, $\tau_s$, where the correlation is maximum (step 307).

$\tau_s$ lies on a sampling point, but the actual maximum-correlation delay,
$\tau_d$, may occur between two sampling points. If $\tau_s$ is either the maximum or minimum
possible delay (step 308), $\tau_d$ is set to $\tau_s$ (step 309). Otherwise, the program finds the
actual maximum-correlation delay using the parabolic interpolation formula
described in Section 5.4.1 (steps 310-312). The above steps are repeated for all the
microphone pairs in the sector (step 313).
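Steps 306 to 312 can be sketched as below. The interpolation shown is the standard three-point parabolic fit through the peak and its two neighbours; it is assumed, not confirmed, to match the formula of Section 5.4.1. Function names are hypothetical and delays are expressed in samples.

```python
def cross_correlation(a, b, max_lag):
    """Return {lag: sum of a[i] * b[i + lag]} for lags in [-max_lag, max_lag]."""
    n = len(a)
    corr = {}
    for lag in range(-max_lag, max_lag + 1):
        corr[lag] = sum(a[i] * b[i + lag]
                        for i in range(max(0, -lag), min(n, n - lag)))
    return corr


def peak_delay(corr, max_lag):
    """Locate the correlation peak, refined between sampling points."""
    ts = max(corr, key=corr.get)           # peak on a sampling point
    if abs(ts) == max_lag:                 # at the edge: no neighbours to fit
        return float(ts)
    y0, y1, y2 = corr[ts - 1], corr[ts], corr[ts + 1]
    denom = y0 - 2.0 * y1 + y2
    if denom == 0.0:                       # flat top: keep the sampled peak
        return float(ts)
    return ts + 0.5 * (y0 - y2) / denom    # vertex of the fitted parabola
```

The refined delay always moves toward the larger of the two neighbouring correlation values, by at most half a sample.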

The program uses the B matrix mentioned in Section 5.4.2 to obtain
the direction vector $K = [K_x, K_y, K_z]$ from the set of time delays (step 400).

The program then calculates the azimuth angle, $\phi$, and the elevation
angle, $\theta$, corresponding to the direction vector obtained (step 401).
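Step 401 can be sketched as a Cartesian-to-angle conversion. The convention assumed here is that the elevation is measured from the vertical axis, so that a horizontal direction gives 90 degrees, in line with the elevation criterion described earlier; normalising by the vector length is an added safeguard and not something the text specifies. The function name is hypothetical.

```python
import math


def direction_to_angles(K):
    """Return (azimuth, elevation) in degrees for K = [Kx, Ky, Kz]."""
    kx, ky, kz = K
    norm = math.sqrt(kx * kx + ky * ky + kz * kz)
    azimuth = math.degrees(math.atan2(ky, kx))
    # Clamp before acos so rounding noise cannot leave the valid domain.
    cos_elev = max(-1.0, min(1.0, kz / norm))
    elevation = math.degrees(math.acos(cos_elev))
    return azimuth, elevation
```

Using `atan2` rather than a plain arctangent of the ratio keeps the azimuth in the correct quadrant over the full 360 degree range.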

The program calculates the SNR as the ratio of the signal power and 
the noise power (step 402). If the SNR exceeds a threshold (step 403), the program 
raises the SNR Flag (step 404). 
The program then evaluates the elevation angle, $\theta$. If $\theta$ is not within a
permissible range of angles (e.g. from 30° to 150°) (step 405), the Elevation Flag is
raised (step 406).



The program calculates corresponding delays from the precise
direction (step 407). The program calculates a delay spread as the sum of squares of
the difference between the individual measured delays and the individual estimated
delays (step 408). If the delay spread exceeds a certain threshold (step 409), the
Delay Spread Flag is raised (step 410).

The program calculates the quality figure of measurement, Q, as a
combination of all or part of the measurement criteria above (step 411). For example,
Q may be set to 0 if any of the measurement flags was raised and set to the SNR
otherwise.

The program transfers $\phi$, $\theta$, and Q to a target system, such as an
automatic camera tracking system used in a video conferencing application (step 412).
The program resets the measurement flags (step 413) and goes back to the beginning
of the program (step 414).

While the invention has been described with reference to several
preferred embodiments, it is not intended to be limited to those embodiments. It will
be appreciated by those of ordinary skill in the art that many modifications can be
made to the structure and form of the described embodiments without departing from
the spirit and scope of the invention, which is defined and limited only in the
following claims. For example, the present invention can be used to locate a
direction of a source transmitting electromagnetic waves.
Abstract

This paper presents a statistically and computationally efficient algorithm for
direction finding of a single far field source using a multi-sensor array. The algorithm
extracts the azimuth and elevation angles directly from the estimated time delays
between the array elements. Hence, it is referred to herein as the Time Delay Direction
Finding (TDDF) algorithm. An asymptotic performance analysis, using a small error
assumption, is conducted. For any 1-D and 2-D array configurations, it is shown that
the TDDF algorithm achieves the Cramer Rao Lower Bound (CRLB) for the azimuth
and elevation estimates provided that the noise is Gaussian and spatially uncorrelated
and that the time delay estimator achieves the CRLB as well. Moreover, with the
suggested algorithm no constraints on the array geometry are required. For the general
3-D case the algorithm does not achieve the CRLB for a general array. However, it is
shown that for array geometries which obey certain constraints the CRLB is achieved
as well.

The TDDF algorithm offers several advantages over the beamforming
approach. First, it is more efficient in terms of computational load. Secondly, the
azimuth estimator does not require a priori knowledge of the wave propagation
velocity. Thirdly, the TDDF algorithm is suitable for applications where the arrival
time is the only measured input, in contrast to the beamformer, which is not applicable
in this case.

PACS# 43.60.011, 43.60.Cg
1. Introduction

In various applications of array signal processing such as radar, sonar and
seismology, there is great interest in detection and localization of wideband sources 1.
The problem of estimating the direction of arrival (DOA) of wideband sources using a
sensor array has been studied extensively in the literature. A common approach 2-7
to this problem, for a single source scenario, is to use the time delay estimation
between two sensors to determine the DOA. Many techniques for estimating the travel
time delay between two receiving sensors have been investigated, see e.g. 2-7. For the
single source and multi-sensor case, Hahn and Tretter 11 introduced the Maximum
Likelihood (ML) delay-vector estimator. ML DOA estimators for the multi-sensor
and multi-source case have also been studied extensively 12-14.

It is well known 11 that the ML DOA estimator, for the single source case with 
a spatially uncorrelated noise, can be realized as a focused beamformer. In this paper 
an alternative approach is proposed, in which the DOA is extracted directly from the 
estimated time delays between the array elements (referred to as the time delay 
vector). This approach is an extension to the multi-sensor case, of the work in 10,11 , 
where the DOA is extracted from the time delay between two sensors for the far field 
case. 

The suggested Time Delay Direction Finding (TDDF) algorithm utilizes the
linear relationship between the time delay vector and the DOA vector in Cartesian
coordinates. This linear relationship allows a closed form estimation of the DOA
vector. The transformation to polar coordinates, i.e. azimuth and elevation, is
straightforward for 1-D and 2-D array geometries. For arbitrarily chosen 3-D array
configurations, the best estimator requires a simple nonlinear least squares
minimization. Alternatively, a closed form suboptimal solution for the 3-D
configuration is also suggested. Finally, it is shown that the TDDF azimuth and
elevation estimator achieves the CRLB provided that the time delay estimator
achieves the CRLB as well.

2. Methods 

2.1 The Time Delay Direction Finding (TDDF) Algorithm

Consider an array of M identical omni-directional sensors with a known
arbitrary geometry measuring the wavefield generated by a single far-field wideband
source in the presence of additive noise. Let $r_i$ denote the location of the i-th
sensor, where $r_i = [x_i, y_i, z_i]$ for the 3-D array, $r_i = [x_i, y_i]$ for the 2-D case, and
$r_i = [x_i]$ for the 1-D case, and let $\phi$ and $\theta$ denote the azimuth and elevation angles of
the radiating source, respectively (see Fig. 1).

Let us now define the differential delay vector,

$$\tau = [\tau_{12}, \tau_{13}, \ldots, \tau_{1M}]^T; \quad \tau_{ij} \equiv t_j - t_i \qquad (1)$$

where the first sensor serves as a reference. The signal DOA vector for the far field
case is given by:

$$k = \begin{bmatrix} \sin(\theta)\cos(\phi) \\ \sin(\theta)\sin(\phi) \\ \cos(\theta) \end{bmatrix} \qquad (2)$$
The time delay between any two sensors is equal to the projection of the
distance vector between them along the $k$ vector divided by the sound velocity.
Consequently, the delay vector can be expressed as follows:

$$\tau = -\frac{Rk}{c}; \quad R \equiv [r_2 - r_1, \ldots, r_M - r_1]^T \qquad (3)$$

where c is the wave velocity and the matrix $R$ is composed of the distance vectors
between all the sensors and the reference sensor.

The objective is to estimate $k$ from the measured time delay vector $\hat{\tau}$.
Studying Eq.(3), it is evident that the problem is overdetermined. Thus, it is
suggested to apply the least squares (LS) method to obtain the estimate. Defining
the error as the difference between the measured time difference vector and the
evaluated time vector (calculated from the assumed $k$ vector), the error vector is
given by:

$$\varepsilon = \hat{\tau} + \frac{Rk}{c} \qquad (4)$$

In the general case, the measurement errors of the time delay vector need not
be uncorrelated. Hence, the solution depends on the covariance matrix $\Lambda_\tau$ of the
delay measurements, which is defined by

$$\Lambda_\tau = E\{\hat{\tau}\hat{\tau}^T\} - E\{\hat{\tau}\}E\{\hat{\tau}\}^T \equiv \mathrm{cov}(\hat{\tau}), \qquad (5)$$

where E{} denotes the expected value operator. The problem is "over determined" for
M>3. The LS solution for $k$, the DOA vector, in this case is given by:



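$$\hat{k} = \underset{k}{\mathrm{ArgMin}} \left[ \left( \frac{Rk}{c} + \hat{\tau} \right)^T \Lambda_\tau^{-1} \left( \frac{Rk}{c} + \hat{\tau} \right) \right] = -c\,B\,\hat{\tau}; \quad B \equiv (R^T \Lambda_\tau^{-1} R)^{-1} R^T \Lambda_\tau^{-1} \qquad (6)$$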
Thus, estimating the DOA vector becomes a simple multiplication between the
measured time delay vector $\hat{\tau}$ and a data independent matrix $B$. The matrix $B$
depends on the array geometry (through $R$) and the time delay covariance matrix,
which under the assumption of spatially uncorrelated noise is known a priori up to a
multiplicative factor which cancels out in this equation. Consequently, it can be
calculated off-line.
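As an illustration of the closed-form estimate, the sketch below solves the least squares problem for the DOA vector in plain Python. It makes a simplifying assumption that differs from the general case in the text: the delay covariance is taken to be proportional to the identity, so the weighting drops out and the normal equations $(R^T R)\,k = -c\,R^T \hat{\tau}$ are solved directly. The function names and the example geometry are hypothetical.

```python
import math


def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def estimate_doa(R, tau, c=340.0):
    """Unweighted LS estimate of k from delays tau (assumed simplification)."""
    cols = len(R[0])
    RtR = [[sum(row[i] * row[j] for row in R) for j in range(cols)]
           for i in range(cols)]
    Rt_tau = [sum(row[i] * t for row, t in zip(R, tau)) for i in range(cols)]
    return [-c * x for x in solve_linear(RtR, Rt_tau)]


# Hypothetical geometry: rows are sensor positions relative to the reference.
R = [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.0, 0.1], [0.1, 0.1, 0.0]]
phi, theta = math.radians(60.0), math.radians(30.0)
k_true = [math.sin(theta) * math.cos(phi),
          math.sin(theta) * math.sin(phi),
          math.cos(theta)]
tau = [-sum(r * k for r, k in zip(row, k_true)) / 340.0 for row in R]  # Eq.(3)
k_hat = estimate_doa(R, tau)
azimuth_deg = math.degrees(math.atan2(k_hat[1], k_hat[0]))
```

In a real implementation the matrix that maps delays to the DOA vector would be computed once off-line, as the text notes; here the system is solved on each call only to keep the sketch short.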

In order to express the DOA vector in terms of azimuth and elevation, one has
to write the vector $k$ in a polar coordinate representation. For a 1-D array
configuration only $k_x$ can be estimated. Hence, assuming horizontal elevation, the
azimuth angle is given by:

$$\hat{\phi} = \cos^{-1}(\hat{k}_x). \qquad (7)$$

With a 2-D array, both the azimuth and elevation angles can be calculated by:

$$\hat{\phi} = \tan^{-1}(\hat{k}_y / \hat{k}_x), \qquad (8a)$$

$$\hat{\theta} = \cos^{-1}(\hat{k}_z) = \cos^{-1}\left( \left( 1 - (\hat{k}_x^2 + \hat{k}_y^2) \right)^{1/2} \right). \qquad (8b)$$

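For the case of a 3-D array, Eq.(2) yields three non-linear equations with two
unknowns $(\phi, \theta)$. Again the problem is over determined. Thus, the azimuth and
elevation angles $(\phi, \theta)$ can be evaluated as the nonlinear least squares estimator
solving Eq.(2):

$$(\hat{\phi}, \hat{\theta}) = \underset{\phi,\theta}{\mathrm{ArgMin}} \left[ (\hat{k} - k(\phi,\theta))^T \Lambda_k^{-1} (\hat{k} - k(\phi,\theta)) \right] \qquad (9)$$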
where $\Lambda_k$ is the covariance matrix of the vector $\hat{k}$, which is given in Appendix A
(Eq.(14)).

An alternative simplified closed-form suboptimal estimate is proposed by:

$$\hat{\phi} = \tan^{-1}(\hat{k}_y / \hat{k}_x), \qquad (10a)$$

In Appendix A, the performance of the TDDF algorithm is analyzed, and it is
shown that it is asymptotically efficient. Furthermore, it is shown that under certain
geometrical constraints on the sensor arrangement even the closed form 3-D solution
achieves the CRLB.

Importantly, it should be noted that the azimuth estimates given above for the
2-D and 3-D array configurations are independent of the wave velocity c. This stems
from the fact that the solutions are given in terms of the ratio between $k_y$ and $k_x$, and
both are linear functions of c. Therefore, errors in the assumed sound speed will not
induce errors in the azimuth angle.
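The cancellation can be made explicit. As a sketch, assume the linear estimate is written as $\hat{k} = -c\,B\,\hat{\tau}$, with $B$ independent of $c$ and with rows $b_x^T$ and $b_y^T$ (this scaling convention is an assumption), and let $c'$ denote a hypothetical incorrect velocity used in place of $c$:

$$\hat{k}' = -c'\,B\,\hat{\tau} = \frac{c'}{c}\,\hat{k} \quad\Rightarrow\quad \frac{\hat{k}'_y}{\hat{k}'_x} = \frac{b_y^T \hat{\tau}}{b_x^T \hat{\tau}} = \frac{\hat{k}_y}{\hat{k}_x}$$

The velocity scales both components by the same positive factor, so their ratio, and hence the azimuth, is unchanged. The 2-D elevation estimate of Eq.(8b) depends on the magnitude $\hat{k}_x^2 + \hat{k}_y^2$ rather than on a ratio, so it does not share this property.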

2.2 Performance Analysis

In Appendix A, the covariance matrix for $(\hat{\phi}, \hat{\theta})$ is calculated. The performance of the
TDDF estimator is compared to the theoretical CRLB. It is shown that for the 1-D and
2-D cases the estimator is asymptotically efficient since it achieves the bound. For the
3-D case the closed form estimator given in Eq.(10) is not always efficient.
However, we derive constraints on the array geometry under which the CRLB is also
achieved.
3. Results

In this section, the performance of the TDDF algorithm is demonstrated via
numerical simulations and by experimental results.

3.1 Numerical Simulations

Simulations were conducted for 2-D and 3-D arrays. The 2-D array was
comprised of 7 randomly located sensors, as shown in Fig. 2(a). In the first set of
simulations the source was positioned at a fixed location with an azimuth angle of
60° and an elevation of 30°. The SNR was scanned in the range of -10 dB to +10 dB,
the integration time was 50 ms, and the frequency bandwidth was 500-1500 Hz. The
noisy estimates of the time delay vectors were generated as Gaussian random vectors
with a covariance matrix $\Lambda_\tau$ given in Eq.(12). The propagation velocity was
taken to be 340 m/sec.

Five hundred Monte Carlo runs were performed for each SNR value. The
azimuth angle was calculated with the TDDF algorithm and the corresponding errors
were computed. The standard deviation of the localization errors was then estimated.
For comparison, the CRLB was also calculated as explained in Appendix A. The
results are depicted in Fig. 3. For clarity of presentation, only 50 azimuth estimation
errors are plotted (as small dots) at each SNR level. The standard deviations of the
TDDF estimator are depicted as circles and the corresponding CRLB is depicted as a
solid line for the entire range. It can be seen that the standard deviations of the TDDF
estimator are effectively located on the CRLB line (in accordance with the
mathematical derivation given in Appendix A).
In a second set of simulations, the same 2-D array was used. The SNR was
held fixed at -6 dB. The elevation angle was set to 50°, and the azimuth angle was
scanned in the range of 0°-360°. The corresponding CRLB line was calculated and
the standard deviation of the TDDF estimator was evaluated for each angle. Figures
4(a) and 4(b) display the errors of the TDDF algorithm for the azimuth and elevation,
respectively. From these curves it is observed that the TDDF estimator achieves the
CRLB, both in azimuth and elevation, for a 2-D array with an arbitrary geometry.

In the third set of simulations the 3-D array shown in Fig. 2(b) was used. This
array has 6 sensors equally distributed on a circle with a radius of 0.1 m, and the
seventh sensor is located 0.1 m above the center of the array. The SNR was again held
fixed at -6 dB. The azimuth was set to 20°, and the elevation angle was scanned in the
range 0°-180°. The following quantities were calculated this time: the CRLB, the
standard deviation of the TDDF estimator, and the theoretical standard deviation
calculated from Eq.(25). Figures 5(a) and 5(b) plot these quantities as a function of the
elevation angle.

The 3-D array used here obeys the condition given in Eq.(27). Consequently,
the CRLB is achieved for the azimuth TDDF estimates (Fig. 5(a)). For the elevation
angle, however, the obtained estimate errors are larger than the CRLB. This
observation is consistent with the fact that the array geometry does not comply with
the condition given in Eq.(29). Nevertheless, the degradation is moderate for this array
configuration. This implies that the closed form estimation given by equation set (10)
is sufficiently accurate for practical purposes. Finally, it can be observed that the
estimated errors match the theoretical standard deviations predicted by Eq.(25). It
should also be noted that bias was also estimated in all the above simulations and was
found negligible (two orders of magnitude smaller than the variance contribution to
the total error).

In practical applications it is usually not easy to obtain the optimal estimate for
the time delay vector $\tau$. In the last numerical example we demonstrate the
performance of the TDDF algorithm when using a suboptimal estimator for the time
delay vector as presented in Appendix C. The time delay vector was estimated via a
cross-correlation between the reference sensor (#1) and the other sensors. The
performance of the TDDF algorithm is compared to that of a beamformer. This
simulation uses the 2-D array consisting of 3 microphones which is shown in Fig. 2(c).
The source direction was set at an azimuth of 30° and an elevation angle of 90°. The
SNR was scanned in the range of -10 dB to +10 dB. Here, the simulation generates the
time record of the sensor data assuming spatially uncorrelated noise. Both the signal
and the noise were random Gaussian variables with a bandwidth of 100-3000 Hz. In
order to perform the beam steering required in the beamformer the data were
interpolated by a factor of 10. The standard deviation error of both estimators was
estimated by 100 Monte Carlo runs. The standard deviations of the TDDF estimate are
depicted by asterisks in Fig. 6, while the standard deviations of the beamformer are
denoted by circles. As can be seen, for most of the studied SNR range the
performance of the TDDF is the same as that of the beamformer. However, the
threshold point for the TDDF appears at SNR ≈ -3 dB, which is higher than the threshold
observed for the beamformer (-6 dB). This result is not surprising since the TDDF,
unlike the beamformer, is not an ML estimator. Potentially there are two factors that can
cause the performance of the TDDF to collapse. The first is the time delay vector
estimation process, and the second is the nonlinear operation for estimating $\gamma$. In all
the cases we have tested, the time delay estimate was the first one to diverge.
In practice it was observed that the cross-correlation function generated spurious
peaks at low SNR, and this is probably the main cause of the performance divergence at
low SNR.

3.2 Experimental Results 

A 3-D array consisting of 7 microphones (AudioTechnica MT350B) arranged
in the same configuration depicted in Fig. 2(b) (radius 0.1 m, height 0.1 m) was
used. Two experiments were conducted. In the first experiment the array was located
in an anechoic chamber (internal dimensions of 1.7 × 1.7 × 1.7 m). In the second
experiment the array was placed in an ordinary room. The sound source was a
recorded male voice (Richard Burton) reading a 20-second long sentence. The signal
was played via a loudspeaker located 1.5 m from the array. The outputs of the array
were recorded using an 8-channel tape recorder (Sony PC208A). The time delay
between the sensors and the central microphone was estimated by filtering the data with
a 500-1500 Hz band pass filter and performing a cross-correlation process. The
integration time was 40 ms, yielding about 500 independent measurements to estimate
the system performance. After completion of each set of measurements, the array was
rotated by 30° and the procedure was repeated.

The azimuth angle corresponding to each set of measurements was estimated
using the TDDF algorithm. The standard deviation of the errors for the TDDF
estimates was then evaluated. The results are outlined in Fig. 1. The data from the
anechoic chamber are denoted by 'o'; the data from the regular room are plotted with a
different marker. As can be observed, the average TDDF error for the anechoic chamber
experiment was about 1.5°. The second experiment was held in the regular room, and
the average error was about 5°. This degradation is attributed to the room
reverberations and the background noise. We have measured the reverberation time in
both rooms. In the anechoic chamber the reverberation time was about 10 ms, while
in the regular room the reverberation time was about 250 ms. Thus, we believe that
the room reverberation was the major cause of the degradation in the accuracy of the
direction estimates.

4. Discussion 

This paper presents and analyzes the Time Delay Direction Finding (TDDF) 
algorithm for a single emitting source using a multi-sensor array. The algorithm 
extracts the azimuth and elevation angles directly from the estimated time delays 
between the array elements. The algorithm offers computational simplicity as it
utilizes the linear relationship between the time delay vector and the DOA vector in 
Cartesian coordinates. This linear relationship allows a closed form estimation of the 
DOA vector. 

An asymptotic performance analysis of the TDDF algorithm, using a small
error assumption, is performed. For the 1-D and 2-D array configurations it is shown
that the TDDF algorithm achieves the Cramer Rao Lower Bound (CRLB) provided
that the time delay vector estimator achieves the CRLB as well. This was proven
mathematically in Appendix A and was demonstrated by numerical simulations. For a
3-D array configuration a suboptimal closed form estimator is presented (Eq.(10)).
Nevertheless, it is shown that when using array geometries that obey certain constraints
the closed-form solution also achieves the CRLB. If the array obeys the condition
given by Eq.(27), then the azimuth estimation is statistically efficient. Furthermore, if
the array also obeys the constraints that are given in Eq.(29), the estimator is efficient
for the elevation angle as well.

Numerical and experimental results were given to demonstrate the
performance of the TDDF algorithm. The experimental results with a 7-microphone
array have shown that in an anechoic chamber the average TDDF azimuth error was
about 1.5 degrees, while in a regular room the average error was about 5 degrees.
These results indicate that the TDDF can serve as a practical tool for passive
localization of a single radiating source.

The proposed TDDF algorithm offers several advantages over the popular
beamforming approach 11. First, the TDDF algorithm is considerably more efficient in
terms of computational load. It calculates the azimuth and the elevation angle directly
from the estimated time delays, and does not involve a two dimensional search over
the array manifold as the beamformer does.

Secondly, for the 2-D and 3-D array configurations, the TDDF algorithm does
not require a priori knowledge of the propagation velocity to estimate the
azimuth, see Eq.(8a) and Eq.(10a), respectively. This property of the TDDF is very
useful in acoustic applications where uncertainty in the propagation velocity occurs
due to wind and temperature variations 17. This is contrary to the beamformer, which
uses the wave propagation velocity as input. In principle the beamforming process
could scan the velocity as an additional unknown parameter; however, this would
substantially increase the computational load.
The third advantage of the TDDF method arises in applications where the
signal is a very short transient and the arrival time of the pulse is directly measured by
the system hardware. Since only the time delays are available in this case,
beamforming is not applicable. For the TDDF algorithm, on the other hand, this
information is sufficient.

Finally, in certain acoustic and geophysical applications, loss of spatial
coherence of the signal received at the sensors may occur if the distance between the
sensors is large 1,17, thus precluding the use of the beamforming approach. In such
cases, nevertheless, the time delay between sensors can still be estimated via incoherent
processing means, such as time of arrival difference, and the TDDF algorithm is still
applicable.

The TDDF has one major disadvantage. It is limited to a single source
scenario. The beamformer algorithm, on the other hand, can localize more than one
source, provided that the angular separation between the sources is more than the
beamwidth.
Appendix A

Performance Analysis of the TDDF Algorithm

In order to evaluate the performance of the TDDF algorithm, analytic
expressions for the accuracy of the azimuth and elevation estimations for the 1-D, 2-D
and 3-D cases are first derived. These expressions are then compared to the expression
of the CRLB as derived by Neilsen 16 and cited in Appendix B.

For uniformity and simplicity of notation let us define $\gamma = \phi$ for the 1-D
case, and $\gamma = (\phi, \theta)^T$ for the 2-D and 3-D cases. Under the assumption of small errors
the covariance matrix of $\gamma$ can be expressed as:

$$\Lambda_\gamma = \nabla_k \gamma \cdot \Lambda_k \cdot \nabla_k^T \gamma = \nabla_k \gamma \cdot \nabla_\tau k \cdot \Lambda_\tau \cdot \nabla_\tau^T k \cdot \nabla_k^T \gamma \qquad (11)$$

where $\Lambda_x$ is the covariance matrix of the vector $x$, and $\nabla_x y$ is the gradient (Jacobian)
of $y$ with respect to $x$. Again, for a small error assumption it can be verified that the
DOA estimates are asymptotically unbiased, thus the covariance matrix represents the
total error of the estimator.

Clearly, the performance of the TDDF algorithm depends on the covariance
matrix of the time delay vector. To demonstrate the performance of the TDDF
algorithm we shall assume that the time delay vector estimator achieves the CRLB.
An efficient algorithm for estimation of the time delay vector, assuming that both the
signal and the noise are zero mean uncorrelated Gaussian processes and the noise is
spatially uncorrelated, has been proposed and studied by Hahn and Tretter 11. Their
work presents an estimator for the time delay vector which achieves the CRLB and
does not require the beamformer process. Their scheme is based on estimating
M(M-1) individual time delays via pre-filtered correlators. The vector $\tau$ is obtained
by a linear combination of the individual time delays. The covariance matrix for the
time delay vector for this estimator, assuming that the SNR is the same for all the
sensors, is given by:

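$$\Lambda_\tau = \mathrm{CRLB}_\tau = \sigma_\tau^2 \left[ I_M + \mathbf{1}_M \mathbf{1}_M^T \right] \qquad (12)$$

where $I_M$ is the M×M identity matrix, $\mathbf{1}_M$ is an M-dimensional vector of ones, 
$\sigma_\tau^2$ is a scalar factor whose defining expression (a sum over frequency involving 
$\rho(l)$ and the term $1 + M\rho(l)$) is not legible in the source, and 
$\rho(l) = S(l)/N(l)$ denotes the SNR at the frequency $\omega_0 l$. 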

In the following it shall be assumed that $\Lambda_\tau$ is given by Eq. (12), i.e. that an efficient 
estimate of $\boldsymbol{\tau}$ is available. 

In Appendix C we derive $\Lambda_\tau$ for a suboptimal time delay estimator that uses only 
(M-1) correlators, with one sensor as a reference sensor, i.e. an efficient estimate is 
obtained only for the separate pairwise delays. It is shown that for sufficiently high 
SNR this estimator also achieves the CRLB. 

First, the covariance matrix of the direction vector $\mathbf{k}$ is calculated. 
From Eq. (6), 

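$$\nabla_\tau \mathbf{k} = c\,B \qquad (13)$$

and therefore 

$$\Lambda_k = E\{\mathbf{k}\mathbf{k}^T\} = B \Lambda_\tau B^T c^2 \qquad (14)$$

Using the definition of B in Eq. (6) and applying some algebraic simplifications 
yields 

$$\Lambda_k = \left( R^T \Lambda_\tau^{-1} R \right)^{-1} c^2 \qquad (15)$$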
Applying the matrix inversion lemma to Eq. (12), it can be written that 

[Eq. (16): the expression for $\Lambda_\tau^{-1}$ is not legible in the source.] 
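
The matrix inversion lemma used at this step can be checked numerically for a matrix of the form identity plus all-ones, which is the structure the text describes for Eq. (12). The sketch below is illustrative only: the dimension `n` is left as a free parameter, the scalar factor is omitted, and all function names are assumptions introduced here.

```python
def identity_plus_ones(n):
    """Build I_n + 1 1^T as a list of rows."""
    return [[(1.0 if i == j else 0.0) + 1.0 for j in range(n)] for i in range(n)]

def inverse_by_lemma(n):
    """Matrix inversion lemma: (I_n + 1 1^T)^-1 = I_n - 1 1^T / (n + 1)."""
    return [[(1.0 if i == j else 0.0) - 1.0 / (n + 1) for j in range(n)] for i in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

n = 4
product = matmul(identity_plus_ones(n), inverse_by_lemma(n))
# Largest deviation of the product from the identity matrix.
max_dev = max(abs(product[i][j] - (1.0 if i == j else 0.0))
              for i in range(n) for j in range(n))
```

The closed form avoids a general matrix inversion, which is what makes the subsequent simplification of Eq. (15) tractable.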



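From the definition of R in Eq. (3) it follows that 

[Eq. (17): expression not legible in the source.] 

where P is the sensor position matrix defined by: 

$$P = [\mathbf{x}, \mathbf{y}, \mathbf{z}] = \begin{bmatrix} x_1 & y_1 & z_1 \\ \vdots & \vdots & \vdots \\ x_M & y_M & z_M \end{bmatrix} \qquad (18)$$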

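Substituting Eq. (16) and Eq. (17) into Eq. (15) yields, after some algebraic 
manipulations: 

[Eq. (19): expression not legible in the source.] 

Assuming without any loss of generality that the coordinate origin is at the 
center of gravity of the array, i.e. $P^T \mathbf{1}_M = \mathbf{0}$, we finally get a simple expression for the DOA 
vector covariance matrix: 

$$\Lambda_k = \left( P^T P \right)^{-1} \sigma_\tau^2 c^2 \qquad (20)$$

where $\sigma_\tau^2$ denotes the scalar factor of Eq. (12). 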

In the following, the expressions for the accuracy of the azimuth and elevation TDDF 
estimations are derived and compared to the CRLB, which is cited in Appendix B. 

A.1 A linear (1-D) array configuration 

For the 1-D case $y = \phi$ and $k = k_x = \cos(\phi)$. Thus 
$\nabla_k y = -1/\sin(\phi)$ and $\Lambda_k = \sigma_\tau^2 c^2 / \sum_i x_i^2$. Inserting into Eq. (11) gives 

$$\sigma_\phi^2 = \frac{\sigma_\tau^2 c^2}{\sin^2(\phi) \sum_i x_i^2} \qquad (21)$$

The CRLB for the 1-D case is given by $\mathrm{CRLB}(\phi) = 1/J_{\phi\phi}$ (see Appendix B). 
Substituting $y_i = z_i = 0$ in this expression yields: 



As can be observed this expression is identical to the right hand side of Eq.(21), 
indicating that the TDDF estimate achieves the CRLB in this case. 
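
As a worked illustration of Eq. (21), the sketch below evaluates the azimuth standard deviation of a linear array. The function name, the sensor spacing, the delay standard deviation `sigma_tau` and the propagation speed `c = 343` m/s are all assumptions chosen for the example, not values from the original text.

```python
import math

def linear_array_azimuth_std(xs, phi_deg, sigma_tau, c=343.0):
    """Azimuth standard deviation (radians) of a linear array per Eq. (21)."""
    mean_x = sum(xs) / len(xs)
    # Positions are re-centred so that the origin is the centre of gravity.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    phi = math.radians(phi_deg)
    return sigma_tau * c / (abs(math.sin(phi)) * math.sqrt(sxx))

xs = [-0.15, -0.05, 0.05, 0.15]            # hypothetical sensor positions, metres
broadside = linear_array_azimuth_std(xs, 90.0, 1e-5)
oblique = linear_array_azimuth_std(xs, 30.0, 1e-5)
```

The 1/sin(φ) factor means the error grows as the source moves toward the array axis; in this example the error at 30° is twice the broadside error.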
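A.2 A planar (2-D) array configuration 

From Eq. (2) and Eq. (8) the Jacobian $\nabla_k \mathbf{y}$ is given by: 

$$\nabla_k \mathbf{y} = \frac{1}{\sin(\theta)\cos(\theta)} \begin{bmatrix} -\sin(\phi)\cos(\theta) & \cos(\phi)\cos(\theta) \\ \cos(\phi)\sin(\theta) & \sin(\phi)\sin(\theta) \end{bmatrix} \qquad (22)$$

Inserting Eq. (22) and Eq. (20) into Eq. (11) yields 

$$\sigma_\phi^2 = \frac{1}{\sin^2(\theta)} \cdot \frac{\cos^2(\phi)\sum x_i^2 + 2\cos(\phi)\sin(\phi)\sum x_i y_i + \sin^2(\phi)\sum y_i^2}{\sum x_i^2 \sum y_i^2 - \left(\sum x_i y_i\right)^2}\, \sigma_\tau^2 c^2 \qquad (23a)$$

and 

$$\sigma_\theta^2 = \frac{1}{\cos^2(\theta)} \cdot \frac{\sin^2(\phi)\sum x_i^2 - 2\cos(\phi)\sin(\phi)\sum x_i y_i + \cos^2(\phi)\sum y_i^2}{\sum x_i^2 \sum y_i^2 - \left(\sum x_i y_i\right)^2}\, \sigma_\tau^2 c^2 \qquad (23b)$$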

A lengthy but straightforward evaluation of the CRLB expressions of 
Eq. (31) for both the azimuth and the elevation angles of a 2-D array, i.e. $z_i = 0$, 
shows that they are identical to Eq. (23). Thus, it is concluded that the TDDF 
algorithm is a statistically efficient estimator for a 2-D array, i.e. it reaches the CRLB. 
It is important to note that no constraints on the array geometry were applied. 
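
The planar-array variances can be evaluated directly from the sensor coordinates. The following sketch assumes the direction-vector convention k = (sin θ cos φ, sin θ sin φ), under which the azimuth variance scales as 1/sin²θ and the elevation variance as 1/cos²θ; the function name, the square test array and the numerical values are illustrative assumptions.

```python
import math

def planar_tddf_variances(points, phi, theta, sigma_tau, c=343.0):
    """Azimuth and elevation variances (rad^2) from the closed forms of Eq. (23)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Second moments about the centre of gravity (the origin assumed by Eq. (20)).
    sxx = sum((p[0] - mx) ** 2 for p in points)
    syy = sum((p[1] - my) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    scale = (sigma_tau * c) ** 2 / (sxx * syy - sxy ** 2)
    cp, sp = math.cos(phi), math.sin(phi)
    var_phi = scale * (cp * cp * sxx + 2 * cp * sp * sxy + sp * sp * syy) / math.sin(theta) ** 2
    var_theta = scale * (sp * sp * sxx - 2 * cp * sp * sxy + cp * cp * syy) / math.cos(theta) ** 2
    return var_phi, var_theta

# Hypothetical square array with 0.2 m sides, source at 60 deg azimuth, 45 deg elevation.
square = [(0.1, 0.1), (0.1, -0.1), (-0.1, 0.1), (-0.1, -0.1)]
var_phi, var_theta = planar_tddf_variances(square, math.radians(60), math.radians(45), 1e-5)
```

For a symmetric array such as this square, the cross term vanishes and the azimuth variance no longer depends on the azimuth itself.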

A.3 A spatial (3-D) array configuration 

The estimate of $\mathbf{y} = (\phi, \theta)^T$ for the 3-D array case involves a nonlinear LS 
minimization, Eq. (9). An alternative closed-form suboptimal estimator was 
suggested in Eq. (10). Here we calculate the performance of the suboptimal 
estimator and derive the conditions on the array geometry that guarantee statistical 
efficiency (i.e. that the CRLB is achieved). 

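From Eq. (2) and Eq. (10) the Jacobian $\nabla_k \mathbf{y}$ is given by: 

$$\nabla_k \mathbf{y} = \begin{bmatrix} -\dfrac{\sin(\phi)}{\sin(\theta)} & \dfrac{\cos(\phi)}{\sin(\theta)} & 0 \\ \cos(\theta)\cos(\phi) & \cos(\theta)\sin(\phi) & -\sin(\theta) \end{bmatrix} \qquad (24)$$

Using Eq. (11) and Eq. (15), 

$$\sigma_\phi^2 = \frac{\sigma_\tau^2 c^2}{\sin^2(\theta)} \, [-\sin(\phi) \;\; \cos(\phi) \;\; 0] \, (P^T P)^{-1} \, [-\sin(\phi) \;\; \cos(\phi) \;\; 0]^T \qquad (25a)$$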



WO 99/53336 PCT/US99/08012 

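and 

$$\sigma_\theta^2 = \sigma_\tau^2 c^2 \, [\cos(\theta)\cos(\phi) \;\; \cos(\theta)\sin(\phi) \;\; -\sin(\theta)] \, (P^T P)^{-1} \, [\cos(\theta)\cos(\phi) \;\; \cos(\theta)\sin(\phi) \;\; -\sin(\theta)]^T \qquad (25b)$$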
In general this estimator is not efficient. However, if the array obeys the following 
geometrical conditions, 

[Eq. (27): expression not legible in the source.] 

then the variance of the azimuth angle estimate is given by 

[expression not legible in the source] 

Evaluating the CRLB for the azimuth estimate under the same conditions yields an 
identical expression. Thus, under the conditions outlined in Eq. (27), the TDDF 
algorithm is also an efficient estimator for the azimuth angle. When studying the 
conditions for uncoupled estimates of azimuth and elevation angles, Nielsen (ref. 16) 
reached the same conditions given in Eq. (27), and gave a few examples of 3-D 
arrays obeying these constraints. 

If we further constrain the array geometry and require a fully balanced array 
configuration, which obeys the following condition, 

$$\sum_i x_i^2 = \sum_i z_i^2 \qquad (29)$$

in addition to the conditions outlined in Eq. (27), it can be shown that for the elevation 
estimate, 

$$\sigma_\theta^2 = \frac{\sigma_\tau^2 c^2}{\sum_i x_i^2} = \mathrm{CRLB}_\theta \qquad (30)$$
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Thus, under the above conditions the TDDF algorithm achieves the CRLB for both 
the azimuth and the elevation angles, and is therefore an asymptotically efficient 
estimator. 
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Appendix B- CRLB for the azimuth and elevation angles 

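Nielsen (ref. 16) derived analytic expressions for the Cramer-Rao Lower Bound for 
the estimation errors of the azimuth angle $\phi$ and the elevation angle $\theta$, using 3-D 
arrays: 

$$\mathrm{CRLB}(\phi) = \frac{J_{\theta\theta}}{J_{\phi\phi} J_{\theta\theta} - J_{\phi\theta}^2}, \qquad \mathrm{CRLB}(\theta) = \frac{J_{\phi\phi}}{J_{\phi\phi} J_{\theta\theta} - J_{\phi\theta}^2} \qquad (31)$$

where 

$$J_{\phi\phi} = G \sin^2(\theta) \sum_{i=1}^{M} \left[ x_i \sin(\phi) - y_i \cos(\phi) \right]^2$$

$$J_{\theta\theta} = G \sum_{i=1}^{M} \left[ x_i \cos(\phi)\cos(\theta) + y_i \sin(\phi)\cos(\theta) - z_i \sin(\theta) \right]^2$$

$$J_{\phi\theta} = G \sin(\theta) \sum_{i=1}^{M} \left[ x_i \sin(\phi) - y_i \cos(\phi) \right] \left[ x_i \cos(\phi)\cos(\theta) + y_i \sin(\phi)\cos(\theta) - z_i \sin(\theta) \right]$$

and G is a scalar factor depending on M and the SNR $\rho(l)$; its defining expression is not legible in the source. 

In these expressions the coordinate origin is at the center of gravity of the array, i.e. 
$\sum_i x_i = \sum_i y_i = \sum_i z_i = 0$, and the coordinate system is given in Fig. 1. 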
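
The bound can be evaluated numerically by forming the three J terms and inverting the resulting 2×2 matrix. The sketch below is illustrative only: it assumes the sensor coordinates are already centred on the center of gravity, and it treats the factor G as a user-supplied number. The value used in the example, G = 1/(σ_τ c)², is an assumption chosen here so that the result can be compared with Eq. (20), not a value taken from the source.

```python
import math

def crlb_azimuth_elevation(points, phi, theta, gain):
    """Return (CRLB(phi), CRLB(theta)) by inverting the 2x2 matrix of J terms."""
    a = [p[0] * math.sin(phi) - p[1] * math.cos(phi) for p in points]
    b = [p[0] * math.cos(phi) * math.cos(theta)
         + p[1] * math.sin(phi) * math.cos(theta)
         - p[2] * math.sin(theta) for p in points]
    j_pp = gain * math.sin(theta) ** 2 * sum(ai * ai for ai in a)
    j_tt = gain * sum(bi * bi for bi in b)
    j_pt = gain * math.sin(theta) * sum(ai * bi for ai, bi in zip(a, b))
    det = j_pp * j_tt - j_pt ** 2
    return j_tt / det, j_pp / det

# Hypothetical centred planar square array (z = 0), source at 60 deg / 45 deg.
points = [(0.1, 0.1, 0.0), (0.1, -0.1, 0.0), (-0.1, 0.1, 0.0), (-0.1, -0.1, 0.0)]
gain = 1.0 / (1e-5 * 343.0) ** 2          # assumed value of G, see lead-in
crlb_phi, crlb_theta = crlb_azimuth_elevation(points, math.radians(60), math.radians(45), gain)
```

With this choice of G, the bound for the square array equals the planar TDDF variance 25(σ_τ c)²/sin²θ obtained from Eq. (20), which is the kind of agreement Appendix A reports for 2-D arrays.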
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Appendix C Suboptimal estimation for the time delays 
vector 



In this appendix we consider a suboptimal estimation of the time delay vector, 
which estimates only M-1 time delays between the first sensor and each of the other 
sensors in the array. Each time delay estimate is based on the data of these two 
sensors only, and ignores the fact that they are part of an M-sensor array. An efficient estimate 
of the time delay between two sensors can be obtained by maximizing the generalized 
cross-correlation (ref. 2). 

Based on the derivation in ref. 8, the covariance matrix of this estimator is given 
by: 

[Eq. (32): expression not legible in the source.] 

In general this estimator does not achieve the CRLB; however, for the high SNR case 
$2\rho \gg 1$ it can be seen that 

[expression not legible in the source] 

Evaluating $\mathrm{CRLB}_\tau$ as given in Eq. (12) for the high SNR case yields 
the same expression, i.e. for a good SNR the suboptimal time delay estimator 
achieves $\mathrm{CRLB}_\tau$. 
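
A pairwise delay estimate of the kind described in this appendix can be sketched as a search for the lag that maximizes the cross-correlation of two sensor signals. Note that this sketch uses a plain, unweighted cross-correlation rather than the pre-filtered (generalized) cross-correlation named in the text; the function name and the synthetic test signal are assumptions made for illustration.

```python
import math

def estimate_delay_samples(ref, other, max_lag):
    """Lag (in samples) that maximises the plain cross-correlation of two equal-length signals."""
    best_lag, best_val = 0, float("-inf")
    n = len(ref)
    for lag in range(-max_lag, max_lag + 1):
        # Sum only over samples where both signals are defined.
        val = sum(ref[i] * other[i + lag] for i in range(n) if 0 <= i + lag < n)
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag

# Synthetic check: 'other' is 'ref' delayed by 3 samples.
ref = [math.sin(0.37 * i) + 0.5 * math.sin(1.3 * i) for i in range(200)]
other = [0.0] * 3 + ref[:-3]
lag = estimate_delay_samples(ref, other, max_lag=10)
```

Repeating this for the reference sensor against each of the other M-1 sensors gives the delay vector used by the suboptimal estimator; dividing the lag by the sampling rate converts it to seconds.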
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Figure Captions 

Figure 1: Schematic representation of the model and the coordinate system used 
here. 

Figure 2: (a) The 2-D array geometry consisting of 7 randomly located microphones, 
which was used in the first two numerical simulations. (b) The 3-D array geometry, 
which was used in the third numerical simulation and the experimental measurements. 
(c) The 2-D array consisting of 3 microphones, which was used in the last numerical 
simulation. 

Figure 3: Azimuth estimation errors of the simulated 2-D array, as a function of the 
SNR. The source is positioned at an azimuth of 60° and an elevation angle of 30°. The 
solid line is the CRLB. The dots depict the magnitude of the errors of the first 25 
individual runs (out of the 500 used). The circles depict the standard deviation of the 
TDDF estimator. 

Figure 4: The errors of the TDDF algorithm for the azimuth (a) and elevation (b) as a 
function of the azimuth, for the 2-D array shown in Fig. 2(a). The SNR is -6 dB, the 
elevation angle is 50°. The solid line is the CRLB; the circles depict the standard 
deviation of the TDDF estimator. 

Figure 5: The errors of the TDDF algorithm for the azimuth (a) and elevation (b) as a 
function of the elevation angle, for the 3-D array shown in Fig. 2(b). The SNR is -6 
dB, the azimuth is 20°. The solid line is the CRLB. The circles depict the standard 
deviation of the TDDF estimator, and the analytic variance expressions (Eq. 23) are 
depicted as '+'. 

Figure 6: Azimuth estimation errors of the 3-microphone 2-D array used in the last 
simulation, as a function of the SNR. The source is positioned at an azimuth of 30°. 
The standard deviation of the TDDF estimate is depicted by [symbol not legible in the source], and the standard 
deviation of the beamformer is plotted by 'o'. 

Figure 7: Experimentally measured azimuth errors of the TDDF algorithm as a 
function of the azimuth using the 3-D array. The source was a speech signal played 
from a loudspeaker 1.5 m away from the array. The circles 'o' denote the results 
measured in an anechoic chamber, and the '*' indicates the results measured in an 
ordinary room. 
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What is claimed is: 

1. A system for finding the direction of a wave source, comprising: 
an array of sensors arranged in a predetermined geometry, each sensor for 

sensing waves from the wave source and generating signals representing the waves; 
an approximate-direction finder, connected to receive the signals representing 

the waves, for processing the signals to find the approximate direction of the wave 

source in terms of the positions of a selected subset of sensors; 

a precise-direction finder, connected to receive information from the 

approximate-direction finder, for finding the precise direction of the wave source by 
further processing the signals representing the waves based on the approximate 

direction; and 

a measurement qualification unit, connected to the precise-direction finder, for 
evaluating validity of the precise direction using a measurement criterion and 
invalidating the precise direction if the measurement criterion is not met. 
2. The system of claim 1, wherein the wave source is a sound source, and 

the sensors are microphones. 

3. The system of claim 1, further comprising: 

a bandpass filter, connected to receive signals from the array of sensors, for 
filtering the signals representing the waves to generate filtered signals containing 
frequencies of a specific bandwidth. 

4. The system of claim 1, wherein the measurement qualification unit 

comprises: 

at least means for calculating a signal-to-noise ratio (SNR); and 
means for invalidating the precise direction if the SNR is below a threshold. 
5. A system for finding the direction of a wave source, comprising: 

an array of sensors arranged in a predetermined geometry, each sensor for 

sensing waves from the wave source and generating analog signals representing the 

waves; 

an analog-to-digital converter, connected to the array of sensors, for 
converting the analog signals to digital signals; 

a bandpass filter for filtering the digital signals to generate filtered signals 
containing frequencies of a specific bandwidth; 
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an approximate-direction finder, connected to receive the filtered signals, for 
processing the filtered signals to find the approximate direction of the wave source in 
terms of a sensor pair selected among the sensors; 

a precise-direction finder, connected to receive information from the 
approximate-direction finder, for finding the precise direction of the wave source by 
further processing the signals representing the waves based on the approximate 
direction; and 

a measurement qualification unit, connected to the precise-direction finder, for 
evaluating validity of the precise direction using a measurement criterion and 
invalidating the precise direction if the measurement criterion is not met. 

6. The system of claim 5, wherein the array of sensors in a predetermined 
geometry includes sensors arranged in a circular arrangement. 

7. The system of claim 5, wherein the wave source is a sound source and 

the sensors are microphones. 
8. The system of claim 5, wherein the measurement qualification unit 

comprises: 

at least means for calculating a signal-to-noise ratio (SNR); and 

means for invalidating the precise direction if the SNR is below a threshold. 

9. The system of claim 8, wherein the means for calculating a signal-to-noise 
ratio comprises: 

means for calculating a signal power for a current signal block of the 

measured signals; 

means for calculating a noise power by finding the minimum signal power 
over signal blocks within a predetermined time period; and 
means for calculating the signal-to-noise ratio by calculating the ratio of the 

signal power of a signal block to the noise power. 
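
By way of non-limiting illustration, the signal-to-noise calculation recited above may be sketched as follows. The function names, the block contents and the threshold value of 10 are assumptions made for this example only and are not taken from the claims.

```python
def block_power(block):
    """Mean squared amplitude of one signal block."""
    return sum(s * s for s in block) / len(block)

def snr_from_blocks(blocks):
    """SNR of the newest block; noise power is the minimum block power in the window."""
    powers = [block_power(b) for b in blocks]
    noise_power = min(powers)
    return powers[-1] / noise_power if noise_power > 0 else float("inf")

quiet = [0.01, -0.01, 0.01, -0.01]      # hypothetical background-noise block
loud = [0.1, -0.1, 0.1, -0.1]           # hypothetical block containing the source
snr = snr_from_blocks([quiet, quiet, loud])
is_valid = snr >= 10.0                  # illustrative threshold, an assumption
```

Taking the minimum over recent blocks gives a noise estimate without requiring an explicit silence detector.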

10. The system of claim 8, wherein the precise direction is calculated in 
terms of an azimuth angle and an elevation angle. 

11. The system of claim 10, wherein the measurement qualification unit 

further comprises: 

means for invalidating the precise direction of the wave source if the elevation 

angle is not within allowable values. 
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12. The system of claim 10, wherein the approximate-direction finder 
comprises: 

means for calculating a zero-delay cross-correlation for pairs of sensors 

adjacent to each other; and 
means for identifying a sector direction by selecting the sensor pair having the 
highest zero-delay cross-correlation. 
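
By way of non-limiting illustration, the sector selection recited above may be sketched as follows for sensors arranged in a ring, where sensor i is adjacent to sensor i+1 (wrapping around). The ring adjacency, the function names and the test signals are assumptions made for this example.

```python
def zero_delay_correlation(a, b):
    """Cross-correlation of two equal-length blocks evaluated at zero lag."""
    return sum(x * y for x, y in zip(a, b))

def select_sector(signals):
    """Index i of the adjacent pair (i, i+1 mod N) with the largest zero-delay cross-correlation."""
    n = len(signals)
    scores = [zero_delay_correlation(signals[i], signals[(i + 1) % n]) for i in range(n)]
    return max(range(n), key=lambda i: scores[i])

# Hypothetical blocks from four sensors; sensors 1 and 2 receive the wave in phase.
signals = [
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, 1, -1, -1],
    [-1, 1, 1, -1],
]
sector = select_sector(signals)
```

A pair that lies roughly broadside to the source receives the wavefront almost simultaneously, so its zero-delay correlation tends to be the highest.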

13. The system of claim 12, wherein the measurement qualification unit 
comprises: 

means for comparing the estimated azimuth angle with the sector direction; 

and 

means for invalidating the precise direction of the wave source if the 
difference between the azimuth angle and the sector direction is not within allowable 
values. 

14. The system of claim 8, wherein the precise-direction finder comprises: 
means for identifying all sensor pairs within the sector; 

means for calculating individual measured delays by calculating a full cross- 
correlation for every pair of sensors within the sector; and 

means for finding the precise direction by finding a least-square solution of the 
individual measured delays. 
15. The system of claim 14, wherein the measurement qualification unit 

comprises: 

means for generating individual estimated delays from the precise direction; 

means for calculating a delay spread by calculating differences between the 
individual measured delays and the individual estimated delays; and 
means for invalidating the precise direction of the wave source if the delay 

spread exceeds a threshold. 
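
By way of non-limiting illustration, a least-square direction solution and the associated delay spread may be sketched as follows for a planar geometry. The baseline vectors, the speed of sound of 343 m/s, the use of an RMS difference as the spread measure, and all function names are assumptions made for this example.

```python
import math

def solve_direction(baselines, delays, c=343.0):
    """Least-squares k = (kx, ky) from delays tau_i = (b_i . k) / c via 2x2 normal equations."""
    sxx = sum(b[0] * b[0] for b in baselines)
    sxy = sum(b[0] * b[1] for b in baselines)
    syy = sum(b[1] * b[1] for b in baselines)
    tx = sum(b[0] * t for b, t in zip(baselines, delays)) * c
    ty = sum(b[1] * t for b, t in zip(baselines, delays)) * c
    det = sxx * syy - sxy * sxy
    return ((syy * tx - sxy * ty) / det, (sxx * ty - sxy * tx) / det)

def delay_spread(baselines, delays, k, c=343.0):
    """RMS difference between measured delays and delays predicted from k."""
    res = [t - (b[0] * k[0] + b[1] * k[1]) / c for b, t in zip(baselines, delays)]
    return math.sqrt(sum(r * r for r in res) / len(res))

baselines = [(0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]     # hypothetical pair geometry, metres
true_k = (math.cos(math.radians(40.0)), math.sin(math.radians(40.0)))
delays = [(b[0] * true_k[0] + b[1] * true_k[1]) / 343.0 for b in baselines]
k = solve_direction(baselines, delays)
azimuth_deg = math.degrees(math.atan2(k[1], k[0]))
spread_clean = delay_spread(baselines, delays, k)
bad = delays[:2] + [delays[2] + 1e-4]                # one corrupted measurement
spread_bad = delay_spread(baselines, bad, solve_direction(baselines, bad))
```

Consistent delays give a spread near zero, while a single corrupted delay leaves a residual that a threshold test can detect.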

16. A system for finding the direction of a sound source, comprising: 

an array of microphones arranged in a predetermined geometry, each 
microphone for sensing sound waves from the sound source and generating analog 
signals representing the sound waves; 

an analog-to-digital converter, connected to the array of microphones, for 
converting the analog signals to digital signals; 
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a bandpass filter for filtering the digital signals to generate filtered signals 
containing frequencies of a specific bandwidth; 

an approximate-direction finder, connected to receive the filtered signals, for 
processing the filtered signals to find the approximate direction of the sound source in 
terms of a sensor pair selected among the sensors; 

a precise-direction finder, connected to receive information from the 
approximate-direction finder, for finding the precise direction of the sound source by 
further processing the signals representing the sound waves based on the approximate 
direction; and 

a measurement qualification unit for evaluating validity of the precise 

direction using a measurement criterion and invalidating the precise direction if the 
measurement criterion is not met. 

17. The system of claim 16, wherein the array of microphones in a 
predetermined geometry includes microphones arranged in a circular arrangement. 

18. The system of claim 16, wherein the microphones are omni 

microphones. 

19. The system of claim 16, wherein the microphones are cardioid 
microphones. 

20. The system of claim 16, wherein the microphones are dipole 

microphones. 

21. The system of claim 16, wherein the specific bandwidth is determined 
by the frequency range of sound waves of interest to direction determination. 

22. The system of claim 16, wherein the specific bandwidth is determined 
by a calibration process of dynamically adjusting the bandwidth to arrive at the 

optimal bandwidth. 

23. The system of claim 16, wherein the measurement qualification unit 

comprises: 

at least means for calculating a signal-to-noise ratio (SNR); and 
means for invalidating the precise direction if the SNR is below a threshold. 
24. The system of claim 23, wherein the means for calculating a signal-to-noise 
ratio comprises: 

means for calculating a signal power for a current signal block of the 

measured signals; 
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means for calculating a noise power by finding the minimum signal power 
over signal blocks within a predetermined time period; and 

means for calculating the signal-to-noise ratio by calculating the ratio of the 
signal power of a signal block to the noise power. 
25. The system of claim 23, wherein the precise direction is calculated in 

terms of an azimuth angle and an elevation angle. 

26. The system of claim 25, wherein the measurement qualification unit 

further comprises: 

means for invalidating the precise direction of the wave source if the elevation 

angle is not within allowable values. 

27. The system of claim 26, wherein the allowable values are between 30 

to 150 degrees. 

28. The system of claim 25, wherein the approximate-direction finder 
comprises: 

means for calculating a zero-delay cross-correlation for pairs of sensors 

adjacent to each other; and 

means for identifying a sector direction by selecting the sensor pair having the 

highest zero-delay cross-correlation. 

29. The system of claim 28, wherein the measurement qualification unit 

comprises: 

means for comparing the estimated azimuth angle with the sector direction; 

and 

means for invalidating the precise direction of the wave source if the 
difference between the azimuth angle and the sector direction is not within allowable 
values. 

30. The system of claim 29, wherein the allowable values are between -60 
and 60 degrees. 

31. The system of claim 23, wherein the precise-direction finder 
comprises: 

means for identifying all sensor pairs within the sector; 

means for calculating individual measured delays by calculating a full cross- 
correlation for every pair of sensors within the sector; and 




means for finding the precise direction by finding a least-square solution of the 
individual measured delays. 

32. The system of claim 31, wherein the measurement qualification unit 

comprises: 

means for generating individual estimated delays from the precise direction; 

means for calculating a delay spread by calculating differences between the 
individual measured delays and the individual estimated delays; and 

means for invalidating the precise direction of the wave source if the delay 
spread exceeds a threshold. 
33. A computer for processing digital signals representing a sound source, 

sampled from an array of microphones sensing sound waves from the sound source 
and for finding the direction of the sound source, comprising: 
a memory containing: 

a program for finding the approximate direction of the sound source by 
processing the digital signals to find the approximate direction of the sound source in 
terms of a sensor pair selected among the sensors; 

a program for finding the precise direction of the sound source by further 
processing the digital signals based on the approximate direction; and 

a program for measurement qualification by evaluating validity of the precise 
direction using a measurement criterion and invalidating the precise direction if the 
measurement criterion is not met. 

34. The computer of claim 33, wherein the program for finding the 
approximate direction of the sound source comprises: 

a program for calculating a zero-delay cross-correlation for pairs of sensors 

adjacent to each other; and 

a program for identifying a sector direction by selecting the sensor pair having 
the highest zero-delay cross-correlation. 

35. The computer of claim 33, wherein the program for finding the 
precise-direction of the sound source comprises: 

a program for identifying all sensor pairs within the sector; 

a program for calculating individual measured delays by calculating a full 
cross-correlation for every pair of sensors within the sector; and 
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a program for finding the precise direction by finding a least-square solution 
of the individual measured delays. 

36. The computer of claim 33, wherein the program for measurement 

qualification comprises: 
at least a program for calculating a signal-to-noise ratio (SNR); and 

a program for invalidating the precise direction if the SNR is below a 
threshold. 

37. A system for controlling the direction of a microphone to the direction 
of a particular sound source, comprising: 

an array of microphones arranged in a predetermined geometry, each 

microphone for sensing sound from the sound source and generating signals 
representing the sound; 

an approximate-direction finder, receiving the signals representing the sound, 
for calculating the approximate direction of the sound source in terms of a pair of 
microphones selected among the microphones; 

a precise-direction finder, connected to the approximate-direction finder, for 
finding the precise direction of the sound source based on the approximate direction 
found; 

a measurement qualification unit for generating a quality figure of 
measurement by evaluating validity of the precise direction using a measurement 
criterion; and 

a controller for controlling the movement of the microphone using the precise 
direction of the sound source and the quality figure. 

38. The system of claim 37, wherein the microphone system is a 
directional microphone system. 

39. The system of claim 38, wherein the directional microphone system 
comprises an adaptive filter for suppressing interference. 

40. The system of claim 39, wherein the directional microphone system 
further comprises an array of microphones. 

41. The system of claim 40, wherein the adaptive filter has a weight-constraining 
unit where frequency representations of filter weights are constrained to 
a threshold to suppress directional interferences. 
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42. The system of claim 38, wherein the directional microphone further 
comprises a dual-processing interference cancelling system comprising: 

an adaptive-processing filter for processing a first portion of a frequency band; 

and 

a fixed-processing filter for processing a second portion of the frequency 

band. 

43. The system of claim 42, wherein the adaptive filter has a weight-constraining 
unit where frequency representations of filter weights are constrained to 
a threshold to suppress directional interferences. 

44. A system for controlling the direction of a camera to the direction of a 

particular sound source, comprising: 

an array of microphones arranged in a predetermined geometry, each 
microphone for sensing sound from the sound source and generating signals 

representing the sound; 
an approximate-direction finder, receiving signals representing the sound, for 

finding the approximate direction of the sound source in terms of a pair of 

microphones selected among the microphones; 

a precise-direction finder, connected to the approximate-direction finder, for 

finding the precise direction of the sound source based on the approximate direction 
found; 

a measurement qualification unit for generating a quality figure of 
measurement by evaluating validity of the precise direction using a measurement 
criterion; and 

a controller for controlling the movement of the camera using the precise direction 
of the sound source and the quality figure. 

45. A method of finding the direction of a wave source, comprising: 
generating signals representing waves from the wave source; 
processing the signals to find the approximate direction of the wave source in 
terms of a sensor pair selected among the sensors; 
finding the precise direction of the wave source by further processing the 

signals representing the waves based on the approximate direction; and 
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qualifying measurements by evaluating validity of the precise direction using a 
measurement criterion and invalidating the precise direction if the measurement 

criterion is not met. 

46. The method of claim 45, wherein the wave source is a sound source, 

and the sensors are microphones. 

47. The method of claim 45, further comprising the step of: 
filtering the signals representing the waves to generate filtered signals 

containing frequencies of a specific bandwidth. 

48. The method of claim 45, wherein the step of qualifying measurements 

comprises: 

at least the step of calculating a signal-to-noise ratio (SNR); and 

the step of invalidating the precise direction if the SNR is below a threshold. 

49. A method of finding the direction of a sound source, comprising the 

steps of: 

generating analog signals representing waves from the sound source using an 

array of sensors arranged in a predetermined geometry, each sensor for sensing the 
waves; 

converting the analog signals to digital signals; 

filtering the digital signals to generate filtered signals containing frequencies 
of a specific bandwidth; 

processing the signals to find the approximate direction of the wave source in 
terms of a sensor pair selected among the sensors; 

finding the precise direction of the wave source by further processing the 
signals representing the waves based on the approximate direction; and 
qualifying measurements by evaluating the validity of the precise direction of 

the wave source using a measurement criterion and invalidating the precise direction 
if the measurement criterion is not met. 

50. The method of claim 49, wherein the step of qualifying measurements 

comprises: 

at least the step of calculating a signal-to-noise ratio (SNR); and 

the step of invalidating the precise direction if the SNR is below a threshold. 

51. The method of claim 50, wherein the step of calculating a signal-to-noise 
ratio (SNR) comprises: 

calculating a signal power for a current signal block of the measured signals; 
calculating a noise power by finding the signal block having the minimum 
signal power over signal blocks within a predetermined time period; and 

calculating the signal-to-noise ratio (SNR) by calculating the ratio of the 
signal power to the noise power. 

52. The method of claim 50, wherein the precise direction is calculated in 
terms of an azimuth angle and an elevation angle. 
53. The method of claim 52, wherein the step of qualifying measurement 

further comprises: 

invalidating the precise direction of the wave source if the elevation angle is 
not within allowable values. 

54. The method of claim 53, wherein the allowable values are between 30 

to 150 degrees. 

55. The method of claim 52, wherein the step of finding the approximate 

direction comprises: 

calculating a zero-delay cross-correlation for pairs of sensors adjacent to each 

other; and 

identifying the sector direction by selecting the sensor pair having the highest 

zero-delay cross-correlation. 

56. The method of claim 55, wherein the step of qualifying measurement 

comprises: 

comparing the estimated azimuth angle with the sector direction; and 
invalidating the precise direction if the difference between the azimuth angle 

and the sector direction is not within allowable values. 

57. The method of claim 56, wherein the allowable values are between -60 

and 60 degrees. 

58. The method of claim 49, wherein the step of finding the precise 

direction comprises: 

identifying all sensor pairs within the sector; 

calculating individual measured delays by calculating a full cross-correlation 
for every pair of sensors within the sector; and 
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finding the precise direction of the sound source by finding a least-square 
solution of the individual measured delays. 

59. The method of claim 58, wherein the step of qualifying measurement 
comprises: 

generating individual estimated delays from the precise direction; 

calculating a delay spread by finding differences between the individual 
measured delays and the individual estimated delays; and 

invalidating the precise direction if the delay spread exceeds a threshold. 

60. The method of claim 49, wherein the microphones are omni 
microphones. 

61. The method of claim 49, wherein the microphones are cardioid 
microphones. 

62. The method of claim 49, wherein the microphones are dipole 
microphones. 

63. The method of claim 49, wherein the specific bandwidth is determined 

by the frequency range of sound waves of interest to direction determination. 

64. The method of claim 49, wherein the specific bandwidth is determined 
by a calibration process of dynamically adjusting the bandwidth to arrive at the 
optimal bandwidth. 

65. A method for processing digital signals representing a sound source, 

sampled from an array of microphones sensing sound waves from the sound source 
and for finding the direction of the sound source in a computer having a memory, 
comprising the steps of: 

finding the approximate direction of the sound source by processing the digital 
signals to find the approximate direction of the sound source in terms of a sensor pair 
selected among the sensors; 

finding the precise direction of the sound source by further processing the 
digital signals based on the approximate direction; and 

qualifying measurements by evaluating validity of the precise direction using a 
measurement criterion and invalidating the precise direction if the measurement 
criterion is not met. 

66. The method of claim 65, wherein the step of finding the approximate 
direction of the sound source comprises the steps of: 




calculating a zero-delay cross-correlation for pairs of sensors adjacent to each 
other; and 

identifying a sector direction by selecting the sensor pair having the highest 
zero-delay cross-correlation. 
67. The method of claim 65, wherein the step of finding the precise-direction 
of the sound source comprises the steps of: 
identifying all sensor pairs within the sector; 

calculating individual measured delays by calculating a full cross-correlation 
for every pair of sensors within the sector; and 
finding the precise direction by finding a least-square solution of the 

individual measured delays. 

68. The method of claim 65, wherein the step of qualifying measurements 

comprises: 

at least the step of calculating a signal-to-noise ratio (SNR); and 
the step of invalidating the precise direction if the SNR is below a threshold. 

69. A method of controlling the direction of a microphone system to the 
direction of a particular sound source, comprising: 

generating signals representing sound waves from the sound source using an 
array of microphones arranged in a predetermined geometry, each microphone for
sensing the sound waves;

processing the signals to find the approximate direction of the wave source in 
terms of a sensor pair selected among the sensors; 

finding the precise direction of the wave source by further processing the
signals representing the waves based on the approximate direction found; and

generating a quality figure of measurement by evaluating validity of the
precise direction using a measurement criterion; and

controlling the movement of the microphone system using the precise 
direction of the sound source and the quality figure. 

70. The method of claim 69, wherein the microphone system is a
directional microphone system.

71. The method of claim 70, wherein the directional microphone system
comprises an adaptive filter for suppressing interference.

72. A method of controlling the direction of a camera to the direction of a
particular sound source, comprising: 

generating signals representing sound waves from the sound source using an 
array of microphones arranged in a predetermined geometry, each microphone for
sensing the sound waves;

processing the signals representing sound waves to find the approximate 
direction of the wave source in terms of a sensor pair selected among the sensors; 

finding the precise direction of the wave source by further processing the
signals representing waves based on the approximate direction; and

generating a quality figure of measurement by evaluating validity of the
precise direction using a measurement criterion; and

controlling the movement of the camera using the precise direction of the 
sound source and the quality figure. 